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We have developed a new type of self-consistent scheme within the GW approximation, which we 
call quasiparticle self-consistent GW (QSGIV). We have shown that QSGWdescribes energy bands 
for a wide-range of materials rather well, including many where the local-density approximation 
fails. QSGW contains physical effects found in other theories such as LDA-|-f/, SIC and GW 
in a satisfactory manner without many of their drawbacks (partitioning of itinerant and localized 
electrons, adjustable parameters, ambiguities in double-counting, etc.). We present some theoretical 
discussion concerning the formulation of QSGW , including a prescription for calculating the total 
energy. We also address several key methodological points needed for implementation. We then 
show convergence checks and some representative results in a variety of materials. 

PACS numbers: 71.15.Qe,71.10.-w,71.20.-b 



o 
> 

(N 
O 

o 



o 



I 

o 
o 



In the 1980's, algorithmic developments and faster 
computers made it possible to apply Hedin's GW ap- 
proximation {GWA) [l| to real materials 0, 01- Espe- 
cially, Hybcrtscn and Louie 0] first implemented the 
GWK within an ab-initio framework in a satisfactory 
manner. Theirs was a perturbation treatment start- 
ing from the Kohn-Sham cigcnfunctions and eigenval- 
ues given in the local density approximation (LDA) to 
density functional theory (DFT)jl, Q. We will denote 
this approach here as Ishot-GVF. Until now Ishot-GVF 
has been applied to variety of materials, usually in con- 
junction with the pseudopotcntial (PP) approximation. 
Quasiparticle (QP) energies so obtained arc in signifi- 
cantly better agreement with experiments than the LDA 
Kohn-Sham eigenvalues Q • 

However, we have recently shown that Ishot-GM^ has 
many significant failings. Even in simple semiconductors 
it systematically underestimates optical gaps [13, Ell- 
in general, the quality of results are closely tied to the 
quality of the LDA starting point. For more complicated 
cases where the LDA cigcnfunctions arc poor, Ishot-GW 
can fail even qualitatively 

A possible way to overcome this difficulty is to deter- 
mine the starting point self-consistently. The effects of 
the eigenvalue-only self-consistency (keeping the cigcn- 
functions as given in LDA), was discussed by Surh, Louie, 
and Cohen Recently, Luo, Ismail-Beigi, Cohen, and 
Louie [Tsj applied it to ZnS and ZnSe, where they showed 
that the band gaps of Ishot-GW^ 3.19 eV and 2.32 cV 
for ZnS and ZnSe are increased to 3.64 eV and 2.41 cV 
by the eigenvalue-only self-consistency (sec Table IIVI 
also). The differences suggest the importance of this self- 
consistency. Furthermore, for ZnSe, the value 2.41 eV 
changes to 2.69 eV when they use cigcnfunctions given by 
generalized gradient approximation (GGA). This differ- 
ence suggests that we may need to look for a means to de- 
termine optimum cigcnfunctions for GWA. Aryasetiawan 



and Gunnarsson applied another kind of self-consistent 
scheme to NiO [1J|- They introduced a parameter for 
the non-local potential which affects the unoccupied eg 
level, and made it self-consistent. They showed that the 
band gap of Ishot-GVF is ~1 eV, and that it is improved 
to ~5.5 eV by the self-consistency. 

Based on these self-consistency ideas, we have devel- 
oped a new ab-initio approach to GW [l^, [H, [13, [111, 
which we now call "quasiparticle self-consistent GW" 
(QSGW) method. QSGW is a first-principles method 
that stays within the framework of Hedin's GWA, that is, 
QSGW is a perturbation theory built around some non- 
interacting Hamiltonian. It does not depend on the LDA 
anymore but rather determines the optimum nonintcr- 
acting Hamiltonian in a self-consistent manner. We have 
shown that QSGFF satisfactorily describes QP energies 
for a wide range of materials. Bruneval, Vast and Rein- 
ing [l^ implemented it in the pseudopotcntial scheme, 
and gave some kinds of analysis including the comparison 
with the Hartree-Fock method and with the Coulomb- 
hole and Screened exchange (COHSEX) methods. 

The present paper begins with a derivation of the fun- 
damental equation of QSGW, and some theoretical dis- 
cussion concerning it (Sec.|I|. The fundamental equation 
is derived from the idea of a self-consistent perturbation. 
We also present a means for computing the total energy 
through the adiabatic connection formalism. Next, we 
detail a number of key methodological points (Sec. [TT|) . 
The present implementation is unique in that it makes 
no pseudopotcntial or shape approximation to the poten- 
tial, and it uses a mixed basis for the response function. 
Coulomb interaction, and self-energy, which enables us 
to properly treat core states. The GWA methodology is 
presented along with some additional points particular to 
self-consistency. In Sec. IIIIl we show some convergence 
checks, using GaAs as a representative system. Then we 
show how QSGW^ works by comparing it to other kinds 
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of GWA for compounds representative of different ma- 
terials classes: semiconductors C, Si, SiC, GaAs, ZnS, 
and ZnSe; oxide semiconductors ZnO and CU2O; transi- 
tion metal monoxides MnO and NiO; transition metals 
Fe and Ni. 



I. THEORY 
A. GWA 

Let US summarize the GWA P, 0] for later discussion. 
Here wc omit spin index for simplicity. Generally speak- 
ing, we can perform GWA from some given one-body 
Hamiltonian written as 
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+ V^°*(r,r'). 



(1) 



The one-particle effective potential V'^^{r, r') can be non- 
local, though it is local, i.e. F'=^(r,r') = y^*^(r)(5(r - r') 
when generated by the usual Kohn-Sham construction. 

determines the set of eigenvalues {si} and eigenfunc- 
tions {\I'i(r)}. From them we can construct the non- 
interacting Green's function G° as 



G° (r,r',cj) = 



^ uj — Ei ± i6 



denotes the dielectric function. As seen in e.g., works 
by Alouani and co-workers [13, [2l| , W calculated from a 
reasonable H° should be in good agreement with experi- 
ments, even if W does not satisfy the /-sum rule because 
is non-local [l^l (because of the so-called scissors op- 
erator). 

Hedin's GWA gives the self-energy S(r,r(ti;) as 



E(r,r» 
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duj'G"{r, r',oj~ oj')W{r, r', oj')e 



-iSu. 



(4) 



From this self-energy, the external potential V'^^*^ from 
the nuclei, and the Hartree potential which is calcu- 
lated from the electron density through G°, we obtain an 
w-dependent one-body effective potential {uj) : 



(5) 



Note that is detemined from the density which is 
calculated for the non-interacting system specified by 
H'^. For simplicity wc omit arguments (r,r'). Then the 
one-body Green function is given as G = l/(— V^2m -I- 
V'-^^{uj)). and are local and w-independent 

potentials. Thus the GH^A maps to V^^{uj). In 
other words, the GW^A generates a perturbativc correc- 
(2) tion AV{uj) to the one-particle potential , written 



where —iS is for occupied states, and +i6 for unoccupied 
states. Within the RPA (random-phase approximation), 
the screened Coulomb interaction is 



W ^er^v = (1- wH)"^- 



(3) 



where H = — iG" x G° is the proper polarization function, 
and v{r, r') = ^ 



is the bare Coulomb interaction, e 



AV{ij)^V^^{ij)-V 



rcS 



(6) 



V^^{uj) and AV{iLj) can be regarded as functionals of 

yeS (qj. ^0) 

In the standard Ishot-GVF with generated by the 
LDA, V"^ is the LDA Kohn-Sham Hamiltonian. Neglect- 
ing off-diagonal terms, the QP energy (QPE) is 



Ekn = £k„ + ^k«[(«'k«|S(r,r',ek„)l*k«) - (*kn|K'^c°^(i-)|*kn) 
where Z^n is the QP renormalization factor: 

l-(*k„|T^S(r, r',ekn)|*k. 

OUJ 



(7) 



(8) 
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Subscripts label the wave vector k and band index n. 
We will write them later as a compound index, i = (k, n) . 
Eq. ((Tj) is the customary way QPEs are calculated in GW. 
However, as we discussed in Ref. [ij], using Z=l instead 
of Eq. ([HI) is usually a better approximation; see also 
Sec. mil Chapter 7 of Ref. [l^] presents another analysis 
where Z=l is shown to be a better approximation, in 
the context of the Frolich Hamiltonian. In any case, we 
have to calculate matrix elements |S(r, r', u;)!^'^^) 



as accurately and as efficiently as possible (off-diagonal 
elements are necessary in the QSG Urease, as explained 
below). 

As we showed in Ref. [ll| , 7?" generated by LDA is not 
necessarily a good approximation. [Even the i?" for "true 
Kohn-Sham" Hamiltonian in DFT can be a poor descrip- 
tor of QP excitation energies [1^.] For example, time- 
reversal symmetry is automatically enforced because V^^ 
is local (and thus real). This symmetry is strongly vio- 
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lated in open /-shell systems [18[. The bandgap of a 
relatively simple III-V semiconductor, InN, is close to 
zero [1, 1^ ; also the QP spectrum of NiO is little im- 
proved over LD A [31 ■ A variety of other examples could 
be cited where GWA starting from = iJ^DA ^ pppr 



approximation. (In contrast, see Sec. Illll and Rcf. [l6|J to 
see how QSGW^ gives consistently good agreement with 
experiment.) 



B. Quasiparticle self-consistent GW 

QS GW is a formalism which determines (or H'^) 
self-consistently within the GWA, without depending 
on LDA or DFT. If we have a mapping procedure 
V'~^^ {uj) we can close the equation to deter- 

mine V^^ , i.e. determine V^^ self-consistently by V^^ — > 
V^^{uj) ^ .... The main idea to determine the 

mapping is grounded in the concept of the QP. Roughly 
speaking, V^^ is determined so as to reproduce the QP 
generated from V'~^^{uj). In the following, we explain 
how to determine this V'~^^ (uj) — > V^^ , and derive the 
fundamental QSGW' equation [ll, [H, [ij, [ll] . 

Based on Landau's QP picture, there are fundamen- 
tal one-particle-like excitations denoted as quasiparticles 
(QP), at least around the Fermi energy Ep. The QPEs 
and QP eigenfunctions (QPeigs), {£'i,$i(r)}, are given 
as Q 
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Re[S(S,)] - E, 



|$,)=0. (9) 



We refer to the states characterized by these Ei and $i(r) 
as the dressed QP. Here Re [AT] means just take the her- 
mitian part of X so Ei is real for E^. This is irrelevant 
around Ep because the anti-hermitian part of S(-Bi) goes 
to zero as Ei —f Ep. On the other hand, wc have another 
one-particle picture described by H'^; wc name these QPs 
as bare QPs, and refer to the QPEs and eigenfunctions 
corresponding to as {si, 5'i(r)}. 

Let us consider the difference and the relation of these 
two kinds of QP. The bare QP is essentially consistent 
with the Landau-Silin QP picture, discussed by, e.g.. 
Pines and Nozieres in Sec. 3.3, Ref. The bare QP 

interact with each other via the bare Coulomb interac- 
tion. The bare QP given by evolve into the dressed 
QP when the interaction H — H'^ is turned on adiabat- 
ically. Here H is the total Hamiltonian (See Eq. (|12p ): 
and the hat signifies that H is written in second quan- 
tized form. and i?" are equivalent. The dressed QP 
consists of the central bare QP and an induced polar- 
ization cloud consisting of other bare QP ; this view is 



compatible with the way interactions are treated in the 
GWA. 

generating the bare QPs represents a virtual ref- 
erence system just for theoretical convenience. There 
is an ambiguity in how to determine iJ"; in principle, 
any can be used if H — H'^ could be completely in- 
cluded. However, as we evaluate the difference H — 
in some perturbation method like GW A, we must utilize 
some optimum (or best) should be chosen so 

that the perturbative contribution is as small as possi- 
ble. A key point remains in how to define a measure of 
the size of the perturbation. We can classify our QSGH^ 
method as a self-consistent perturbation method which 
self-consistently determines the optinum division of H 
into the main part and the residual part H — H'^. 
There are various possible choices for the measure; how- 
ever, here we take a simple way, by requiring that the 
two kinds of QPs discussed in the previous paragraphs 
correspond as closely as possible. We choose so as to 
repoducc the dressed QPs. In other words, we assign the 
difference of the QPeig (and also the QPE) between the 
bare QP and the dressed QP as the measure, and then we 
minimize it. From the physical point of view, this means 
that the motion of the central electron of the dressed QP 
is not changed by H — Note that H — contains 
two kinds of contributions: not only the Coulomb inter- 
action but also the one-body term T^™* — . The latter 
gives a counter contribution that cancel changes caused 
by the Coulomb interaction. 

We now explain how to obtain an expression in prac- 
tice. Suppose that self-consistency has been somehow 
attained. Then wc have w {i?i,<f>i} around Ep. 

{'^i} is a complete set because they come from some 
though the are not. Then we can expand 

Re[S(e,)]|5',) (« Re[E(£;,)]|$,)) in {£i,*i} as 

Rc[E(£,)]|*^) = l*j)Rcp(£.)]j^, 

where Re^{uj%j = (*j|Re[i;(cj)]|*j). Then we intro- 
duce an energy-independent operator R defined as 



i?-^|*,)Rc[I](e,)b-.(*.|, 



which satisfies RY^i) = Re[S(ei)]|'I'i). Thus we can use 
this R instead of Re[I](£'i)] in Eq. however, R is not 
hermitian thus we take only the hermitian part of R as 
V"^ = Re[i?]; 



J 



\ I*.) {Mne^)h + Rc[E(£,)]y} mode- A 



(10) 
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for the calculation of {Ei,^i} (w {£i,\fi}) in Eq. 
Thus we have obtained a mapping V'^^ (uj) — > 

^cff. given we can calculate V^^ in Eq. pU|) 
through E(tj) in the GM^A. With this T^'''^ together with 
V^, which is calculated from the density for G° (or 
we have a new V^^ . The QSGW^ cycle determines all 
H^, T^®*, W and G self-consistcntly. As shown in Sec. IIIII 
and also in Refs. [i5i,I6„IZ], QSGW^ systematically over- 
estimates semiconductor band gaps a little, while the di- 
electric constant Coo is slightly too small \W^ . 

It is possible to derive Eq. pO|) in a straightforward 
manner from a norm-functional formalism. We first 
define a positive-definite norm functional M(F"^*^) = 
Tr[Re[Z\V(w)]/9Re[Z\y(ci;)]] to measure the size of per- 
tubative contribution. Here the weight function p = 
S{lu ~ iJ") defines the measure; Tr is for space, spin and 
u!. For fixed p, this M{V°^) is treated as a functional 



of V^^ because V^^ determines AV{uj) through Eq. ^ 
in the GW^A. As MiV^) = J2j.i\i'^^\^4^'^^ i^j)] ~ 
V'^^\^j)\'^ , we can show its minimum occurs when 
Eq. (|10[) is satisfied in a straightforward manner. This 
minimization formalism clearly shows that QSGFK deter- 
mines for a given V^^*'; in addition, it will be useful 
for formal discussions of conservation laws and so on. 
The discussion in this paragraph is similar to that given 
in Ref. [l^, though we use a slightly different AI{V''^). 

Eq. ([TU)) is derived from the requirement so that 
{e^jVPi} « {Ei,^i} around Ep. This condition does not 
necessarily determine V^'^ uniquely. It is instructive to 
evaluate how results change when alternative ways are 
used to determine V^^ . In Ref. [l^ we tested the follow- 
ing: 



=IIlV'ORe[S(£,)]"(V'»|, + $]|^^)Re[I](^F)],,(^j|, mode-B (11) 

I 



In this form (which we denote as 'mode-B'). the off- 
diagonal elements are evaluated at Ep. The diagonal 
parts of Eq. (fTTj) and Eq. pO|) are the same. As noted 
in Ref. and as discussed in Sec. Illli Eqs. pi7)) and 
(jlip yield rather similar results, though we have found 
that mode-A results compare to experiment in the most 
systematic way. 

As the self-consistency through Eq. PU)) (or Eq. ([TT]) ') 
results in {£^,^1'^} « {Ei,^i}, we can attribute phys- 
ical meaning to bare QP: we can use the bare QP 
in the independent-particle approximation [25l |. when, 
for example, modeling transport within the Boltzmann- 
equation [2y|. It will be possible to calculate scatter- 
ing rates between bare QP given by H'^, through cal- 
culation of various matrix elements (electron-electron, 
electron-phonon, and so on). The adiabatic connection 
path from iJ*^ to H used in QS G is better than the 
path in the Kohn-Sham theory where the eigenfunction of 
Hks (Kohn-Sham Hamiltonian) evolves into the dressed 
QP. Physical quantities along the path starting from Hks 
may not be very stable. For example, the band gap can 
change very much along the path (it can change from 
metal to insulator in some cases, e.g. in Ge and InN [ll|; 
QSGW is free from this problem [3), even if it keeps 
the density along the path. [Note: Pines and Nozieres 
(Ref. [1^, Sec. 1.6) use the terms 'bare QP' and 'dressed 
QP' differently than what is meant here. They refer to 
eigenstates of H as 'bare QP.' and spatially localized QP 
as 'dressed QP' in the neutral Fermi liquid.] 

From a theoretical point of view, the fully sc GW 
[27l . [28j looks reasonable because it is derived from the 
Luttinger-Ward functional E[G]. This apparently keeps 



the symmetry of G, that is, E[G] = E[JZ[G]] where 7^[G] 
denotes some G G mapping (any symmetry in Hamil- 
tonian, e.g. time translation and gauge transformation); 
this clearly results in the conservation laws for external 
perturbations [2^ because of Noether's theorem (exactly 
speaking, we need to start from the effective action for- 
malism for the dynamics of G [30|). However, it contains 
serious problems in practice. For example, fully sc GW 
uses W from H = ~iG x G; this includes electron-hole 
excitations in its intermediate states with the weight of 
the product of renormalization factors Z x Z. This is in- 
consistent with the expectation of the Landau-Silin QP 
picture [3, |3l|. In fact, as we discuss in Appendix [XI 
the effects of Z factor included in G are well canceled 
because of the contribution from the vertex; Bechstedt 
et al. showed the Z-factor cancellation by a practical 
calculation at the lowest order (3l| . In principle, such 
a deficiency should be recovered by the inclusion of the 
contribution from the vertex; however, we expect that 
such expansion series should be not efficient. 

Generally speaking, perturbation theories in the 
dressed Green's function G (as in Luttinger-Ward func- 
tional) can be very problematic because G contains two 
different kinds of physical quantities to intermediate 
states: the QP part (suppressed by the factor Z) and 
the incoherent part (e.g. plasmon- related satellites). In- 
cluding the sum of ladder diagrams into H via the Bethe- 
Salpeter equation, should be a poorer approximation if 
G is used instead of G", because the one-particle part is 
suppressed by Z factors; also the contribution from the 
incoherent part can give physically unclear contributions. 
The same can be said about the T- matrix treatment [33 | . 
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Such methods have clear physical interpretation in a QP 
framework, i.e. when the expansion is through G^. A 
similar problem is encountered in theories such as "dy- 
namical mean field theory" +GW [1^, where the local 
part of the proper polarization function is replaced with 
a "better" function which is obtained with the Anderson 
impurity model. This question, whether the perturba- 
tion should be based on G, or on G^, also appeared when 
Hedin obtained an equation to determine the Landau QP 
parameters; See Eq. (26.12) in Ref. [l|. 

As we will show in Sec. Ill] (see Ref. ^ also), QSGW 
systematically overestimates band gaps, consistent with 
systematic underestimation of Coo • This looks reasonable 
because W does not include the electron-hole correlation 
within the RPA. Its inclusion would effectively reduce the 
pair excitation energy in its intermediate states. If we do 
include such kind of correlation for W at the level of the 
Bethe-Salpeter equation, we will have an improved ver- 
sion of QSGW^. However, the QPE obtained from G°W 
with such a W corresponds to the F = 1 approximation, 
from the perspective of the E = G'^WT approximation, 
as used by Mahan and Sernelius [U; the contribution 
from r is neglected. In order to include the contribution 
properly, we need to use the self-energy derived from 
the functional derivative of Ec as shown in Eq. (^1]) in 
next section, where we need to include the proper polar- 
ization Hx which includes such Bcthc-Salpeter contribu- 
tions; then we can include the corresponging F. It looks 
complicated, but it will be relatively easy to evaluate just 
the shift of QPE with neglecting the change of QPeig; we 
just have to evaluate the change of Ec numerically, when 
we add (or remove) an electron to Gq. However, numeri- 

I 



where |0a) is the ground state for /f^. We define 
E"^""^ = dX{0x\V''''^\0x). This path is different from 
the path used in DFT, where we take a path starting 
from Hks to H while keeping the given density fixed. 
Along the path of the adiabatic connection, the Green's 
function changes from G'' to G. Because of our minimum- 
perturbation construction, Eq. p^ . the QP parts (QPeig 
and QPE) contained in G are well kept by G°. If 
71a (r) = (OA|'T-(r)|OA) along the path is almost the same as 
nx=o (r) , E^ plus the second term in the RHS of Eq. (fTBl) 
is reduced to {Ox=o\H^ + V''=''*|Oa=o) (this is used in E'^"* 
below) . The last term on the RHS of Eq. (fT6|) is given as 



cal evalution for these contributions are demanding, and 
beyond the scope of this paper. 



C. Total energy 

Once is given, we can calculate the total energy 
based on the adiabatic connection formalism [23l . [sol . [ssl . 
l36j . Let us imagine an adiabatic connection path where 

the one-body Hamiltonian = + evolves into 
the total Hamiltonian H , which is written as 

H = + + V'=''\ (12) 
^' = E /*V3i(r)(-^)^.(r), (13) 

(T 

t>°"* = 5Z / dvVS^\v)n^{y), (14) 
y'='= = i^ Jdrdr'v{r,r') x 

i^Ur)4'l{r')i'a'{r')Mr)- (15) 

V"^ is also defined with instead of in Eq. (fTI)l . 
We use standard notation for the field operators i/'<T(r), 
spin index cr, and external potential V°^^{r). We omit 
spin indexes below for simplicity. 

A path of adiabatic connection can be parametrized 
by X as = H" + A(t>'='^' - V"^ + 1>°'=). Then the total 
energy E is written as 



(16) 

I 

E^ + E"" + E"", where 

E^ = ^j\xnx{r)v{r,r')nxir'), (17) 

E- = -^l\x\nx{ry)\Mr,r'), (18) 

E'^^j\x{Ox\VnO).)-E''-E^ (19) 

Here we used nA(r,r') = (OA|«'t(i")*(r')|OA)- 

We define the Ist-order energy E^^^ as the total energy 
neglecting E^: 

E^""' = E^ + E^""' + E^ + E^, (20) 

where subscript means that we use nA=o(r) instead of 
nA(r) (and same for nA(r, r')) in the definition of E'^'^'^ , 



E = E'' 



^ dE^ 

rfA— dA(OA|V^^'^*-V^'="|OA)+ / dA(OA|y^lOA), 
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and E""; E^ = (Oa=o|^''|Oa=o)- This is the HF-Ukc 
total energy, but with the QPcig given by V^^ . 
E"^ is written as 



1 



(21) 



where Il\ is the proper polarization function for the 
ground state of H^. The RPA makes the approxima- 
tion Il\ w n>,=o (nA=o is simply expressed as 11 below). 
The integral over A is then trivial, and 



^cRPA ^ _Ti.[log(l - vU) + vH], 



-1 



E 



RPA 



E' 



E' 



,RPA 



(22) 



E^^^ denotes the RPA total energy. 11 is given by 
the product of non-interacting Green's functions 11 = 
— iG" X G", where G" is calculated from V^^ . Thus 
we have obtained the total energy expression E^^^ for 
QSGW. As we have the smooth adiabatic connection 
from A = to A = 1 in QSGW(from bare QP to dressed 
QP) as discussed in previous section, we can expect that 
we will have better total energy than E^^^ where we use 
the KS eigenfunction and eigenvalues (where the band 
gap can change much from bare QP to dressed QP). E^^^ 
will have characteristics missing in the LDA, e.g. physical 
effects owing to charge fluctuations such as the van der 
Waals interaction, the mirror force on metal surfaces, the 
activation energy, and so on. However, the calculation of 
^c,RPA numerically very difficult, because so many un- 
occupied states are needed. Also deeper states can couple 
to rather high-energy bands in the calculation of H. Few 
calculations have been carried out to date [bsI. [sgI. [stI. [sst . 
As far as we tested within our implementation, avoiding 
systematic errors is rather difficult. In principle, the ex- 
pression E^^^ is basis-independent; however, it is not 
so easy to avoid the dependence; for example, when we 
change the lattice constant in a solid, E'^'^^^ artificially 
changes just because of the changes in the basis sets. 
From the beginning, very high-level numerical accuracy 
for E^^^ required; very slight changes of E'^'^^^ results 
in non-negligible error when the bonding originates from 
weak interactions such as the van der Waals interaction. 
These are general problems in calculating the RPA-level 
of correlation energy, even when evaluated from Kohn- 
Sham eigenfunctions. 

QSGM^with Eq. ^ or Eq. ^ can result in multiple 
self-consistent solutions for G" in some cases. This situ- 
ation can occur even in HF theory. For any solution that 
satisfies the self-consistency as Eq. pUj) or Eq. (fTTI) . we 
expect that it corresponds to some metastable solution. 
Then it is natural to identify the lowest energy solution 
as the ground state, that is, we introduce a new assump- 
tion that "the ground state is the solution with the low- 
est total energy among all solutions". In other words, 
the QSGiy method may be regarded as a construction 
that determines by minimizing E^^^ under the con- 
straint of Eq. pUj) (or Eq. ([TI]) '). This discussion shows 



how QSGM^is connected to a variational principle. The 
true ground state is perturbatively constructed from the 
corresponding H'^. However, total energy minimization 
is not necessary in all cases, as shown in Sec. IIIII We 
obtain unique solutions (no multiple solutions) just with 
Eq. (fTUl) or Eq. pT]) (Exactly speaking, we can not prove 
that multiple solutions do not exist because we can not 
examine all the possibilities. However, we made some 
checks to confirm that the results are not affected by ini- 
tial conditions). In the cases we studied so far, multiple 
solutions have been found, e.g. in GdN, YH3 and Ce 
[isl . [39! . These cases are related to the metal- insulator 
transition, as we will detail elsewhere. As a possibil- 
ity, we can propose an extension of QS GW, namely to 
add a local static one-particle potential as a correction to 
Eq. (fTUl) . The potential is controlled to minimize E^^^. 
This is a kind of hybridization of OS G W with the opti- 
mized effective potential method [23 . See Appendix [B] 
for further discussion as to why the total energy mini- 
mization as functional of V^^ is not a suitable way to 
determine V^^ . 

Finally, we discuss an inconsistency in the con- 
struction of the electron density within the QSGM^ 
method. The density used for the construction of 
in the self-consistency cycle is written as n'~^ (r) = 

/ da;G°(r,r,a;)e**", which is the density of the non- 
interacting system with Hamiltonian H'^. On the other 
hand, the density can be calculated from E^^^ by the 
functional derivative with respect to y'^*. Since E^^^ is 
a functional of V'^^ , we write it as E^^^ \V'^^^] ; its deriva- 
tive gives the density (r) = ^s^. The difference 
in these two densities is given as 



(r) 



= ^ |dl|d2(S(l, 2) - y-(l, 2)) ^gl|, (23) 

where V^'^ is the static non-local potential defined in 
Eq. (fTO]) or Eq. (fTTj) . This difference indicates the size 
of inconsistency in our treatment; from the view of the 
force theorem (Hellman-Feynman theorem), we need to 

identify (r) as the true density, and n^" (r) for 

as the QP density. We have not evaluated the difference 
yet. 



II. GW METHODOLOGICAL DETAILS 

A. Overview 

In the full-potential linear muffin-tin orbital method 
(FP-LMTO) and its generalizations, eigenfunctions are 
expanded in linear combinations of Bloch summed 
mufhn-tin orbitals (MTO) XjiLji^) '^^ wave vector k as 



(24) 



RLj 
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n is the band index; ^'kn(r) is defined by the (eigen- 
vector) coefScients ZRij_„ and the shape of the Xfllj ('') ■ 
The MTO we use here is a generalization of the usual 
LMTO basis, and is detailed in Refs.[lll, H^l- R identi- 
fies the site where the MTO is centered within the prim- 
itive cell, L identifies the angular momentum of the site. 
There can be multiple orbitals per RL; these are labeled 
by j. Inside a MT centered at R, the radial part of x 
is spanned by radial functions {ipRi, ipm or (pRi, ipm, 
ip]^i) at that site. Here (pm is the solution of the radial 
Schrodinger equation at some energy (usually, for I 
channels with some occupancy, this is chosen to be at 
the center of gravity for occupied states), (pm denotes 
the energy-derivative of ipm] ipf^i denotes local orbitals, 
which are solutions to the radial wave equation at en- 
ergies well above or well below e^. We usually use two 
or three MTOs for each I for valence electrons (we use 
just one MTO for high I channels with almost zero oc- 
cupancy). In any case these radial functions are repre- 
sented in a compact notation {fRu}- u is a compound 
index labeling L and one of the {^ri, i^ri, Pri) triplet. 
The interstitial is comprised of linear combinations of en- 
velope functions consisting of smooth Hankel functions, 
which can be expanded in terms of plane waves [4]| . 

Thus \E'kn(r) in Eq. can be written as a sum of 
augmentation and interstitial parts 

*k„(r) = J2^ll^Rui^) + J2(^oPhir), (25) 

Ru G 

where the interstitial plane wave (IPW) is defined as 




if r e any MT 



u II r t any 

exp(i(k 4- G) • r) otherwise 



(26) 



and (fi^^ are Bloch sums of (pRu 



V'LW = ^¥>i?„(r-R-T)exp(»k.T). (27) 



T and G are lattice translation vectors in real and re- 
ciprocal space, respectively. Eq. ([25|l is equally valid 
in a LMTO or LAPW framework, and eigenfunctions 
from both t ype s of methods have been used in this GW 
scheme j4^l43j. Here we restrict ourselves to (general- 
ized) LMTO basis functions, based on smooth Hankel 
functions. 

Throughout this paper, we will designate eigenfunc- 
tions constructed from MTOs as VAL. Below them are 
the core eigenfunctions which we designate as CORE. 
There are two fundamental distinctions between VAL 



and CORE: first, the latter are constructed indepen- 
dently by integration of the spherical part of the LDA po- 
tential, and they are not included in the secular matrix. 
Second, the CORE eigenfunctions are confined to MT 
spheres H^]. CORE eigenfunctions are also expanded 
using Eq. (j25p in a trivial manner (/3q" = and only 
one of a^'^ is nonzero); thus the discussion below applies 
to all eigenfunctions. VAL and CORE. In order to obtain 
CORE eigenfunctions. we calculate the LDA Kohn-Sham 
potential for the density given by 7?°, and then solve the 
radial Schrodinger equation. In other words, we substi- 
tute the nonlocal potential with its LDA counterpart 
to calculate CORE. More details of the core treatment 
are given in Sec. Ill Bl 

We need a basis set (referred to as the mixed basis) 
which encompasses any product of eigenfunctions. It is 
required for the expansion of the Coulomb interaction v 
(and also the screened interaction W) because it connects 
the products as Through Eq. (gS]), products 

^km X ^'k2n' can be expanded by P^^^^^{r) in the in- 
terstitial region because Pgi(i") ^ ^G2(^) = -fcl+Gal^)- 
Within sphere i?, products of eigenfunctions can be ex- 
panded by B^]^^^{r), which is the Bloch sum of the 
product basis (PB) {i?fl„i(r)}, which in turn is con- 
structed from the set of products {(^_R„(r) x </5_Ru'(r)}. 
For the latter we adapted and improved the procedure 
of Aryasetiawan [i^. As detailed in Sec. Ill Cl we define 
the mixed basis {Mf{r)} = {-Pg('')' ^flm ('")}' where the 
index / = {G, Rm} classifies the members of the basis. 
By construction, Mf is a virtually complete basis, and 
efficient one for the expansion of ^'kn products. Com- 
plete information to generate the GWA self-energy are 
matrix elements qn]'^ q-Un' Mf) , the eigenvalues ekn, 
the Coulomb matrix vjjCk) = {Mf\v\M^), and the over- 
lap matrix (Afj^jAfj). (The IPW overlap matrix is neces- 
sary because (PgI^g') ^ for G 7^ G'.) The Coulomb 
interaction is expanded as 



^(r,r') = J2 \Mf)vuik){M^\ 



(28) 



k,/,J 



where we define 



I' 

0^,,= {Mf,\Mf). 



(29) 
(30) 



W and the polarization function H shown below are ex- 
panded in the same manner. 



J 



The exchange part of E is written in the mixed basis as 



(*q„|S,l*. 



BZ occ 



€{111/ 



}^ ^ i^qr, I *q_k„' Mf) VI J (k) ( A/^*q_k„, | 



(31) 
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It is necessary to treat carefully the Brillouin zone (BZ) summation in Eq. ((3T|) and also Eq. p4|) because of the 
divergent character of ■y/j(k) at k ^ 0. It is explained in Sec. Ill El 

The screened Coulomb interaction Wjj (q, oj) is calculated through Eq. ([3]) , where the polarization function is 
written as 

n/.(q,-) =1.LL c.-(e,^,„,-Sk„) + ^<5 

^ BZ unocc^ (M,^Vl'k»|^q+k»0(^q+k»>|^knM^) 

k « -c^-(ek„-eq+k„') + «'5 

When time-reversal symmetry is assumed, 11 can be simplified to read 

BZ occ unocc 



nzj(q,C^) =Y.Y.Y1 (^^7^*k«|*q+k„')(*q+kn'|*knAf^) 



k n n' 

X ( ^) (33) 

Wc developed two kinds of tetrahedron method for the Brillouin zone (BZ) summation entering into 11. One follows 
the technique of Rath and Freeman The other, which we now mainly use, first calculates the imaginary part (more 
precisely the anti-hermitian part) of 11, and determines the real part via a Hilbert transformation (Kramers-Kronig 
relation); see Sec. Ill Dl The Hilbert transformation approach significantly reduces the computational time needed to 
calculate 11 when a wide range of lu is needed. A similar method was developed by Miyake and Aryasetiawan (47| . 
The correlation part of S is 

BZ All 

(*q„|Se(c^)|*q™) = ;^^^(*q„|*q_k„'M/)(Mj*q-k„'|*q,n) 
k n' IJ 

l^W}Ak,u;')^- ^ — . (34) 



where = W—v {—i6 must be used for occupied states, 
+id for unoccupied states). Sec. Ill Fl explains how the uj- 
integration is performed. 



B. Core treatment 

Contributions from core (or semi-core) eigenfunctions 
require special cares. In our GW, CORE is divided into 
groups, COREl and C0RE2. Further, VAL can be di- 
vided into "core" and "val" . Thus all eigenfunctions are 
divided into the following groups: 



All eigenfunctions 



(35) 



CORE 
COREl C0RE2 



core 



VAL 

"val". 



VAL states are computed by the diagonalization of a 
secular matrix for MTOs; thus they are completely or- 
thogonal to each other. VAL can contain core eigenfunc- 
tions we denote as "core" . For example, we can treat the 



Si 2p core as "core" . Such states are reliably determined 
by using local orbitals, tailored to these states [Tl| . 

COREl is for deep core eigenfunctions. Their screen- 
ing is small, and thus can be treated as exchange-only 
core. The deep cores are rigid with little freedom to be 
deformed; in addition, CORE24-VAL is not included in 
these cores. Thus we expect they give little contribution 
to n and to Sc for C0RE2-I-VAL. Based on the division 
of CORE according to Eq. ((551) . we evaluate S as 



s = e: 



COREl 



^CORE2+VAL 



^CORE2-|-VAL 



(36) 



(We only calculate the matrix elements (\I'i|S|^'j), where 
i and j belongs to C0RE2-hVAL, not to COREl.) We 
need to generate two kinds of PB; one for Ej°^^^, the 
other for sCORE2+val 5.core2+val^ ^s explained 

in Sec. Ill CI these PB should be chosen taking into ac- 
count what combination of eigenfunction products are 
important. States C0RE2-I-VAL are usually included in 
n, which determines W. Core eigenfunctions sufRciently 
deep (more than ~ 2 Ry below Ep), are well-localized 
within their MT spheres. For such core eigenfunctions, 
we confirmed that results are little affected by the kind 
of core treatments (COREl, C0RE2, and "core" are es- 
sentially equivalent); sec Rcf. 
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As concerns their inclusion in the generation of 11 and 
S, Eq. ([361) means that not only VAL but also C0RE2 
arc treated on the same footing as "val". However, we 
have found that it is not always possible to reliably treat 
shallow cores (within ~ 2 Ry below Ep) as C0RE2. Be- 
cause CORE eigenfunctions are solved separately, the or- 
thogonality to VAL is not perfect; this results in a small 
but uncontrollable error. The nonorthogonality problem 
is clearly seen in vU{c[,uj) as q ^ 0: cancellation be- 
tween denominator and numerator becomes imperfect. 
(We also implemented a procedure that enforced orthog- 
onalization to VAL states, but it would sometimes pro- 
duce unphysical shapes in the core eigenfunctions.) Even 
in LDA calculations, MT spheres can be often too small 
to fully contain a shallow core's eigenfunction. Thus we 
now usually do not use C0RE2; for such shallow cores, 
we usually treat it as "core" S VAL; or as COREl when 
they are deep enough. We have carefully checked and 
confirmed the size of contributions from cores by chang- 
ing such grouping, and also intentional cutoff of the core 
contribution to W and so on; see Ref. jT]| . 



C. Mixed basis for the expansion of 5' x ^ 

A unique feature of our GW implementation is its 
mixed basis set. This basis, which is virtually complete 
for the expansion of the products 'I' x 5", is central for 
the efficient expansion of products of relatively localized 
functions, and essential for proper treatment of very lo- 
calized states such as core states or / systems. Products 
within a MT sphere are expanded by the PB procedure 
originally developed by Aryasetiawan [i^. We use an 
improved version explained here. For the PB construc- 
tion we start from the set of radial functions {(j>is{r)}, 
which are used for the augmentation for \1/ in a MT site. 
I is the principal angular momentum, s is the other index 
(e.g. we need s = 1 for 0, and s = 2 for in addition 
to s = 3,4, .. for local orbitals and cores). The products 
4>is{r) X (j)i's'{r) can be re-ordered by the total angular 
momentum Ip = l + l', ...,\l — l'\. Then the possible prod- 
ucts of the radial functions are arranged by Ip. To make 
the computation efficient, we need to reduce the dimen- 
sion of the radial products as follows: 

(1) Restrict the choice of possible combinations 4>is{r) 
and (j)i's'{r). In the calculation of W, one </> is 
used for occupied states, the other for unoccupied 
states [Hi. In the calculation of {"i'kn'\G xWl'i'i^n) , 
^km ^ *kn appears, with ^f^^ coming from G. 
Thus all possible products can appear; however, we 
expect the important contributions come from low 
energy parts. Thus, we define two sets $occ and 
^UNOCC as the subset of {(j)is{r)}. $occ includes 
ipisir) mainly for occupied states (or a little larger 
sets), and $unocc is $occ plus some ^(^(r) for 
unoccupied states (thus f&occ € $uNOCc)- Then 
we take all possible products of 4)is{r) x ^(/^/(r) for 



4>is{r) e $occ and (j)i's'{r) € <I>uNOCC- Follow- 
ing Aryasetiawan (45j . we usually do not include 
0-kinds of radial functions in these sets (we have 
checked in a number of cases that their inclusion 
contributes little). 

(2) Restrict Ip to be less than some cutoff l^^^. remov- 
ing expensive product basis with high Ip. In our 
experience, we need l^^^ = 2x (maximum I with 
non-zero (or not too small) electron occupancy) is 
sufficient to predict band gaps to ^ O.lcV, e.g. we 
need to take l^^^ ~ 4 for transition metal atoms. 

(3) Reduce linear dependency in the radial product 
basis. For each Zp, we have several radial prod- 
uct functions. We calculate the overlap matrix, 
make orthogonalized radial functions from them, 
and omit the subspace whose overlap eigenvalues 
are smaller than some specified tolerance. The tol- 
erance for each can be different, and typically tol- 
erances for higher ^p can be coarser than for lower 
/p. 

This procedure yields a product basis, ~ 100 to 150 func- 
tions for a transition metal atom, and less for simple 
atoms (see Sec. IIII Al for GaAs). 

There are two kinds of cutoffs in the IPW part of the 
mixed basis: |q -I- Gj^^^ for eigenfunctions Eq. (^5)) . and 
|q-|- G|}Jg^^ for the mixed basis in the expansion of W. 
In principle, |q -t- G\^^^ must be 2 x |q -I- Gl^^^^ to span 
the Hilbert space of products. However, it is too ex- 
pensive. The computational time is strongly controlled 
by the size of the mixed basis. Thus wc usually take 
small Iq+GjJJ'j^^, rather smaller than |q + G|^g^^ (the 
computational time is much less strongly controlled by 
|q+ G|Max)- As wc illustrate in See ing -0.1 cV level 
accuracy can be realized for cutoffs substantially below 

2x|q+G|Lx- 

For the exchange part of COREl, we need to construct 

another PB. It should include products of COREl and 

VAL. Wc construct it from (/)corei x-i/jvAL; where V'val G 

<f>uNOCCi so as to make it safer ($unocc is bigger than 

*occ)- 

We also have tested other kinds of mixed basis which 
are smoothly augmented; we augment IPWs and con- 
struct smoothed product basis (value and slope vanish- 
ing at the MT boundary). However, little computational 
advantage was realized for such a mixed basis. 



D. Tetrahedron method for W 

Eq. (|32p requires an evaluation of this type of BZ in- 
tegral; 

X(,,^ /(£a(k))[l-/(£b(k))] 

A(W)=> Jk -, TTT (37) 

^ Lu - (£b(k) - ea(k)) ± iS 

where is a matrix element and /(e) is the Fermi 
function. To evaluate this integral in the tetrahedron 
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method, we divide the BZ into tetrahedra. Tk is re- 
placed with its average at the four corners of k, T^. 
We evaluate the integral within a tetrahedron, linearly 
interpolating ea(k) and eb(k) between the four corners 
of the tetrahedron. In the metal case, we have to di- 



vide the tetrahedra into smaller tetrahedra; in each of 
them /(£a,(k))[l — /(eb(k))] = 1 or = are satisfied; see 
Ref. [46|. Thus we only pick up the smaller tetrahedron 
Ail satisfying /(ea(k))[l — /(eb(k))] = 1 and calculate 



-ImAX{Lu) = Tk / d'kSiij - (eb(k) - ea(k))), 
7^ J An 



(38) 



Based on the assumption of linear interpolation, the integral in lmAX{uj) equals the area of the cross section of 
tetrahedron in a plane specified by energy w = eb(k) — £a(k). We create a histogram of energy windows [uj{i),uj{i + 

1)], i = 0, 1, by calculating the weight falling in each window as J^l^^i^^^ lmAX{u!)duj. We take windows specified 

as LL){i) ^ a i + b (i = 0, 1, ..), where we typically take a ^ 0.05eV and 6 ~ 1 — 10 cV. Summing over contributions 



from all tetrahedra, we finally have 

ImX([w(i),w(i-M)]) / dwImX(w) = ^WklO^k 

Juj{i) 

Applying this scheme to Eq. ([33|) . we have 

Imn(g, [u;{^),c.{^ + 1)]) = ^ <""'(*)r^^ 



-icinn' IJ 



(39) 



(40) 



where 



nn IJ 



Imll (Kramers-Kronig relation). 



(Afj'^'kn|^'q+kn')('^'q+kn'|^kn-^^j)- Tlic real part of n is calculated through a Hilbert transform of 



r 



Some further considerations arc as follows. 

(A) For band index n, £kn rnay be degenerate. When 
it occurs, we merely symmetrize w^"" (i) with respect to 
the degenerate Ekn- 



(B) When Eq. (j32p is evaluated without time-reversal 
symmetry assumed, windows for negative energy must 
be included because lmU{q, —uj) ^ lmIl{q,Lj). 

(C) In some limited tests, we found that linearly inter- 
polating Tk within the tetrahedron, instead of using Tki 
did little to improve convergence. 

(D) We sometimes use a "multi-divided" tetrahedron 
scheme to improve on the resolution of the energy de- 
nominator. We take a doubled k-mesh when generating 
w^"" (i). For example, we calculate 11 with a 4x4x4 



k-mesh for k sum in Eq. (|40f 
when we calculate w^"" (i). 



but we use a 8x8x8 mesh 

Then the improvement of 
n is typically intermediate between the two meshes: we 



J 



obtain results between the 4x4x4 and 8x8x8 results in 
the original method. 

E. Brillouin-zone integral for the self-energy; the 
smearing method and the ofFset-F method. 

1. Smearing method 



To calculate Sx and Sc, Eqs. ([31]) and ([M]) . each pole 
at Eq-kri' is slightly smeared. The imaginary part, pro- 
portional to 6{uj ~ £q-kn')' is broadened by a smeared 
function 6(u! — Sq-kn')- In order to treat metals, this 
smearing procedure is necessary. Usually we use a Gaus- 
sian for the smeared function 



1 



27rcr 



exp(- 



2a- 



r), 



(41) 



though other forms have been tested as well. 

We explain the smearing procedure by illustrating it 
for Ex- Eq- ((3T|) becomes 



BZ 



(^'q„|Ex|^'q™) = ^pq„™(k), 
k 

all 

Pq„,„(k) = 5m ^(^F - eq-k„')(*q«l*q-kn'M/^)f/,7(k)(M;^«'q_kn'|^ 



qjn/ 



(42) 
(43) 



n' IJ 
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where 9{uj) is a smeared step function. d9{uj)/duj = 6{uj). 
Owing to the sudden Fcrmi-cnergy cutoff in the metals 

case, Pqnm 

(k) may not vary smoothly with k. Increas- 
ing a smooths out pq„,„ (k) , making it more rapidly con- 
vergent in spacing between k-points, at the expense of 
introducing a systematic error to the fully k-converged 
result. With a denser k mesh, smaller a can be used, 
which reduces the systematic error. In practice we can 
obtain converged results for given a with respect to the 
number of k points, and then take the cr — > limit. 

2. Offset-F method 

The integrand in Eq. ([^^ contains divergent term pro- 
portional to l/|kp for k — ^ 0. In order to handle this 



divergence we invented the offset-F method. It was origi- 
nally developed by ourselves Q and is now used by other 
groups [H, li^l ; it is numerically essentially equivalent 
to the method of Gygi and Baldereschi [501, where the 
divergent part is treated analytically. 



We begin with the method of Gygi and 
Baldereschi The Coulomb interaction i;/,7(k). 

which is a periodic function in k. includes a divergent 
part v^jjik) cx U'^{k)U^*{k)F{k) where F{k) ^ l/\k\'^ 
as k ^ 0. Uj{k) are coefficients to plane wave e**"" in the 
mixed basis expansion. They divided the integrand into 
two terms, one with no singular part which is treated 
numerically; the other is a combination of analytic 
functions that contain the singular part: 



BZ BZ 



(*q„|E,|^'q„,) = ^(pq„„(k)-pO„„(k))-t-^q„,„^i?^(k), (44) 

k k 

where 

pO„„(k) = ylq„™F(k) (45) 

Aqnm = fim |kpPq„„(k). (46) 
k^O 



As for the first term on the right-hand side (RHS) of 
Eq. (|44|) . the integrand Pqnm{k) — Pq„,„(k) is a smooth 
function in the BZ with no singularity (more precisely, 
it can contain a part cx fca;/|kp, however, it adds zero 
contribution around k = because it is odd in k); it is 
thus easily evaluated numerically. The second term is 
analytically evaluated because F{k) is chosen to be an 
analytic function as shown below. We evaluate the first 
term by numerical integration on a discrete mesh in a 
primitive cell in the BZ. The mesh is given as 

k{n,^2,^s) = 24^bi + ^b, + ^b3), (47) 



BZ 

E 



NIN2N3 



and has the requisite divergence at k — > 0. (Gygi and 
Baldereschi used a different function in Ref. [501, t>ut it 
satisfies the same conditions.) We can easily evaluate 
F{k) analytically. Thus it is possible to calculate Sx if 



we can obtain the coefficients Aa 



However, it is not 



easy to calculate Aqnm- This is especially true for Sc, 
Eq. ([34|) . where the coefficients are energy-dependent. 
The offset-F method avoids explicit evaluation of 
while retaining accuracy essentially equivalent to 



the method described above. It evaluates the k-integral 
in Eq. (|42]) as a discrete sum 



BZ 

E 



1 



^ N1N2N3. . . 



(49) 



Forms of the analytic functions F{k) are chosen so that 
it can be analytically integrated. We choose F{k) as 



All 



^(k) = E 



exp(-alq-h Gp) 



(48) 



where G denotes all the inverse reciprocal vectors and a 
is a parameter. -F(k) is positive definite, periodic in BZ 



where ^ denotes the sum for k(ii,i2,Z3) but k = is 
replaced by the offset-F point Q = ((3,0,0). Q is near 
k = 0, and chosen so as to integrate F{k) exactly: 



BZ 



E^(k) 



1 



N1N2N3 



E ^w- 



(50) 



Then Eq. (|35]) becomes 
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(*q„|Sx|*qm) 



1 



y ] Pqrim(k) 



(51) 



r 



For larger N1N2N3, ((9,0,0) is closer to (0,0,0), thus 
the first term on the RHS of Eq. ([51]) is little different 
from the sum with the mesh Eq. (|17)) . Then the second 
term in Eq. ([STjl is exactly the same as the second term 
in Eq. because of Eq. ([50]) . Thus the simple sum 

can reproduce the results given by the 

N1JV2N3 ^ — ^ 

il ,12,13 

method Eq. ([44]) . 

In practical applications, we have to take some set of 
Q points to preserve the crystal symmetry. Typically 
we prepare six Q points, {±Q, 0, 0), (0, ±Q, 0), (0, 0, ±Q), 



J 



and then generate all possible points by the crystal sym- 
metry. The weight for each Q should be l/NiN2Ns di- 
vided by the total number of Q points. The value of Q 
is chosen to satisfy Eq. ([50| . It evidently depends on 
the choice of F{h); in particular when F is given by 
Eq. ((48|) . Q depends on a. A reasonable choice for a 
is a — > (then no external scale is introduced). How- 
ever, we found a = 1.0 a.u. is small enough for simple 
solids, and the results depend little on whether a = 1.0 
or a ^ 0. 

In addition, we make the following approximation: 



all 



Pqn(Q) ~Y.J2^(^P - £q«')(*q«l*qn'^^7^=°>l'/./(Q)(A^J=°*qn'l*q 

I 



(52) 
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That is, the matrix element is not evaluated at k = Q but 
at k = 0. This is not necessary, but it reduces the com- 
putational costs and omits the contribution cx A:^/|kp 
which gives no contribution around k = 0. 

Finally, we use crystal symmetry to evaluate Eq. (|43| : 
vjj(k) and also VFjj(k) are calculated only at the irre- 
ducible k points and at the inequivalent offset-F points. 

We also tested a modified version of the offset-F 
method with another kind of BZ mesh (it is not used for 
any results in this paper). The uniform mesh is shifted 
from the F point: 



k(«i,«2,«3) = 27r 



bi 



«2 



N2 



^h2 



1 



(53) 



Then we evaluate the BZ integral as 



BZ 



^F(k)^ ^ W^Fik) 



W, 



Q 



N1N2N3 



F(Q). (54) 



where k(il,i2,i3) is used. Wq is a parameter given 
by hand to specify the weight for the offset-F point Q. 
(In principle, j^^^p^F{Cl) must give no contribution as 
Ni,N2,N3 00. Thus Eq. ((M]) can be taken a trick 



to accelarate the convergence on N1N2N3; this is nec- 
essary in practice). We usually use a small value, e.g., 
Wq ~ 0.01 or less, but taken not too small so that it does 



not cause numerical problems. Integration weights are 
Wk = l/(iVi7V2A'^3) except for those closest to F. These 
latter VF(shortcst k) are chosen so that the sum of all Wu 
and Wq is unity. Then Q is determined in the same man- 
ner in the original offset-F method. This scheme picks 
up the divergent part of integral correctly, and can ad- 
vantageous in some cases, in particular for oddly shaped 
Brillouin zones. 

The original method with Eq. (|T7l) has difficulty 
in treating anisotropic systems like a one-dimensional 
atomic chain. In such a case, we can not determine rea- 
sonable Q for the BZ division for, e.g., (A^i = 7V2 = 1 and 
A^3 = large number), while the modified form Eq. ([55]) 
has no difficulty. We can choose Q close to F (any Q 
can be chosen if it is close enough to F). As Wq be- 
comes smaller, so does Q. Two final points relevant to 
the modified version are: 

• In some anisotropic cases (e.g. anti-ferromagnetic 
II NiO), we need to use negative Wq because the 
shortest k on regular mesh is already too short 
and the integral of F{q) evaluated on the mesh of 
Eq. ([551) already exceeds the exact value. However, 
the modified version works even in such a case. 

• In some cases (e.g. Si), the shifted mesh can be 
somewhat disadvantageous because certain symme- 
try operations can map some mesh points to new 
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points not within the shifted mesh, Eq. ((53|) . Then 
the QP energies that are supposed to be degenerate 
are not precisely so for numerical reasons. One so- 
lution is to take denser k mesh to avoid the effect 
of discretization. Our current implementation al- 
lows us to compute 11 and S with different meshes, 
Eq. ((TTl) or Eq. (|53|) . This is sometimes advanta- 
geous to check the stability of calculations with re- 
spect to the k mesh. 



where oj^ = — £q_kn'. Here we fix indexes /, J, k, q, n' 
and omit them for simplicity. We use a version of the 
imaginary-path axis integral method [sil . Is^ ] : the inte- 
gration path is deformed from the real-axis to the imag- 
inary axis in such a way that no poles are crossed; see 
Fig. [TJ As a result, X is written as the sum of an imag- 
inary axis integral Ximg and pole contributions X^^aX- 

.^imp- is 



F. Li;'-integral for 



Eq. (p4)l contains the following integral 



X 



°° iduj' 1 



27r 



(55) 



Xu 



1 c I J 1 r 

duj'Wsiu;') +-/ du'WAioj')- 



(56) 



where Ws(w') = [W%iLu') + W''{-iuj')]/2 and VFa(w') = 
[W%iuj')-W%-iuj')]/2i. Note that Wa(w') ^ unless 
time-reversal symmetry is satisfied. Xjmg adds a hermi- 
tian contribution to T,'^. 

We have to pay attention to the fact that Ws{lli') is 
rather strongly peaked around u' = 0, and follow the 
prescription by Aryasetiawan [s^ to evaluate the first 
term in Eq. ([55]) : 

(i) Divide Ws{uj') into Ws{Q) exp{-a'^uj''^) and the 
residual, Ws{uj') — H^s(O) exp(— a^tj'^). a is a pa- 
rameter. The first integral is performed analyti- 
cally, the residual part numerically. We chose a in 
one of two ways, either to match d'^Ws{u!')/dLu''^ at 
Lo' = 0, or use a = 1.0a. u. (as originally done by 
Aryasetiawan). Then we find the latter is usually 
good enough, in the sense that it does not impose 
additional burden on numerical integration of the 
residual term. 

(ii) The residual term is integrated with a Gaussian 
quadrature in the interval x = [0,1], making the 
transformation cu' = x/{l — x) a.u. (as was done 
first by Aryasetiawan). Typically, a 6 to 12 point 
quadrature is sufficient to achieve convergence less 
than 0.01 eV in the band gap. 

(iii) On Ximg, we did not include smearing of the pole 



^q— kn' 



as explained in Sec. HIE 11 because it gives 



little effect, although it is necessary for the eval- 
uation on Xroai as described below. However, we 
add another factor to avoid a numerical problem; 
we add an extra factor 1 — exp[— (w'^ + uj^')l1o?\ 
in the integrand of the numerical integration. This 



avoids numerical difficulties, since lo^/{uj' -t-w^^) 
can be large. In our implementation, we simply 
fix a in (iii) to be the same as a for the smear- 
ing in Eq. (I41[) and check the convergence. Gener- 
ally speaking, a — 0.001 a.u. is satisfactory (the 
differences are negligible compared to tr ^ re- 
sults). This procedure is not always necessary, but 
it makes calculations safer. 



Thus the analytic part proportional to VFs(O) is 
-W^s(O) 



Weexp(— a^o; ) 1 - e ^ 



du'- 



X < erfc(aa-'e) — crfc lo^aIo? 



where we use a formula 



V (2a2 



(57) 



2/3^ 



2erfc(/3/i). Here erfc(...) is the complementary error 

function (to check this formula, differentiate with respect 
to /.t on both sides). For small a, we expand Eq. (|57p in 
a Taylor series in a, to keep numerical accuracy. 
Next, let us consider Xroai- It has three branches 

'WiuJe) if w -£;f > > 

^rcal = { -WiuJe) iiQ> UJe> U) ~ Ef (58) 

otherwise 



Poles are smeared out as discussed in Sec. Ill El eq_k)i' 
(thus We) has a Gaussian distribution. This means that 
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FIG. 1: Integration contour of the correlated part of the 
self-energy, Eq. (|56|l . The original path along real axis 
is deformed without crossing poles. G(q — k, — uo') — 
J2n ^/('^ — a;' — Eq-kn) has poles at to' = lj — Eq-kn. 



we take ±wW'^{lJ^) instead of Eq. ((58|l . where w is the 
weight sum falling in the range [0, w — £'f] (or [w — Sp, 0]), 
and oJ^ is the mean value of oje in the range. The QP 
lifetime (which originates from the non-hermitian part 
of S'^) comes entirely from the imaginary part of Xycai- 
Thus the condition that the lifetime is infinite (no imag- 
inary part) at Lu — Ep is assured from the path shown in 
Fig. [U since lu ~ Ep means no contour on real axis. The 
lifetime of a QP is due to its decay to another QP accom- 
panied by the states included in Im[VF'^]; these states can 
be independent electron-hole excitations, plasmon-like 
collective modes, local collective charge oscillations (e.g. 
d electrons in transition metals oscillate with ~5 eV [53|). 
and also their hybridizations. In the insulator case, there 
is "pair-production threshold" : if the electron QPE is be- 
low (conduction band minimum) its imaginary part 
is zero because it can not decay to another electron ac- 
companied by an electron-hole pair (similarly for holes): 
the imaginary part of the QPE for low energy electrons 
(holes) is directly interpreted as the impact ionization 
rate. We will show results elsewhere; to calculate QPE 
lifetimes, a numerically careful treatment is necessary- 
especially for Imiy^. 



AV" = 



{*k„,ek„} 




n -> w -> 







FIG. 2: Self-consistency cycle. Cycle consists of a large loop 
^LDA ^ ^^xc ^ _^ Zi V H^°^ + .... 

There is an inner loop within the (g" = ^lda ^ step, 
where the density and LDA potential are made self-consistent 
for a fixed AV". The cycle is started by taking AV^" = 0, or 
= H^^^; the first iteration is equivalent to the standard 
Q^DA^r^DA ^j^j^ Z = 1 and including the off-diagonal parts 
of E. The time-consuming part is framed in bold. Subscripts 
to E and V^"^ refer to the basis it is represented in; see text. 

The self-consistent cycle is shown in Fig. [51 It is initi- 
ated as in a standard GW calculation, by using i/^DA 
for i?" ("zeroth iteration"). It is a challenge to cal- 
culate the QSGW^ Eq. (Uni), efficiently. The most 
time-consuming part of a standard GW calculation is 
the generation of the polarization function, n(q, w), be- 
cause only diagonal parts of E are required. However 
in the QSGFFcase, the generation of E for Eq. (|10p is 
typically 3 to 10 times more expensive computationally. 
These two steps together (bold frame. Fig. [2]) are typi- 
cally 100 to 1000 times more demanding computationally 
than the rest of the cycle. Once a new static V^^ is gener- 
ated, the density is made self-consistent in a "small" loop 
where the difference in the QSGFKand LDA exchange- 
correlation potentials, AV^'^ = V^'^ — y^^xDA^ jg j^gp^. 
constant. V^'^ — y^^.LDA jg expected to be somewhat less 
sensitive to density variations than V^'^ itself, so the in- 
ner loop updates the density and Hartree potential in a 
way intended to minimize the number cycles needed to 
generate E sclf-consistently. 

The required number of cycles in the main loop to make 
E self-consistent depends on how good the LDA starting 
point is: for a simple semiconductor such as Si, typically 
three or four iterations arc enough for QP levels to be 
converged to 0.01 eV. The situation is very different for 
complex compounds such as NiO; the number of cycles 
then depends not only on the quality of the LDA starting 
point, but other complicating factors as well. In particu- 
lar, when some low-energy fluctuations (spin and orbital 
moments) exist, many iterations can be required for self- 
consistency. Solutions near a transition to a competing 
electronic state are also difficult to converge. 



15 



V^"^ is calculated only at irreducible k-points on the 
regular mesh, Eq. (|47p . However, we must evaluate 
zil/™(k) at continuous k-points for a viable scheme. 
In the self-consistency cycle, for example, the offset-F 
method requires eigenfunctions at other k. The ability 
to generate at any k is also needed to generate con- 
tinuous energy bands (essential for reliable determination 
of detailed properties of the band structure such as effec- 
tive masses) or integrated quantities, e.g. DOS or EELS 
calculations. Also, while it is not essential, we typically 
use a finer k mesh for the step in the cycle that generates 
the density and Hartree potential (oval loops in Fig. [2]), 
than we do in the GW part of the cycle |54| . 

The interpolation is accomplished in the (generalized) 
LMTO basis by exploiting their finite range. Several 
transformations are necessary for a practical scheme, 
which we now describe. For a given AV^'^, there are 
three kinds of basis sets: the MTOs Xklj ! basis 
in which = ijLDA _^ ^yxc -g diagonal (the "QSGIV" 
basis); and the basis ^'j^J?^ in which iJ^DA diagonal 
(the "LDA" basis). '~' over the subscripts signifies that 
the function is represented in the basis of LDA eigen- 
functions. LDA and orbital basis sets are related by the 
linear transformation Eq. For example AV^^ in 

the QSGW^ basis, Z\lC;,(k) = (*k„| /iV^"^(k) l^-k™), is 
related to AV^^ ^, ^, ^^^^{\^) ^ (x^,^-,, | ^T^'^^(k) |x^i,) 
in the MTO basis by 

^^nmO^) ^ ^ ^-R.'L'j',n'^^nfL'j',IlLiO^)'^Ti.Lj.m 
R'L'j',RLj 

(59) 

y™(k) is generated in the QSGW^basis on the irreducible 
subset of k points given by the mesh Eq. ()47p . which 
we denote as kmcsh- Generally, interpolating V^"^ 
to other k will be problematic in this basis, because of 
ambiguities near band-crossings. This problem can be 
avoided by transformation to the MTO basis (localized 
in real space). Thus, the transformation proceeds as: 

^^Tn(kmosh) ^ ^^'^L'j' ,RLj (kmcsh) ^ ^^TU L' j' ,Tl+T Lj ■ 

Eq. ((59|) is inverted to obtain j^^j(k,„osh) with 

the transformation matrix z^^j m- last step 

is an inverse Bloch transform of AV^]^,j, ^^^^(k) to 
real space. It is done by EFT techniques in order 
to exactly recover AV^j^,y Rij(k) by the Bloch sum 

,R+TLj"- It is accomplished in practice 
by rotating AV^'^' according to the crystal symmetry to 
the mesh of k-points in the fuU BZ, Eq. (|47)) . 

Prom ^Vg?2,'j',R+TLj' can make AV^^{k) in the 
LDA basis for any k by Bloch sum and basis transforma- 
tion. We then approximate ziF™(k) for the higher lying 
states by a matrix that is diagonal in the LDA basis. The 
diagonal elements are taken to be linear function of the 
corresponding LDA energy. ZiT^?(k) = ae|^^ -I- h. Thus 



we use AV^'^ as 

[ae^^ + b)5nm otherwise. 

There are two reasons for this. First, Z\T^5j(k) is well 
described at the mesh points kmosh, but often does not 
interpolate with enough precision to other k for states fifh 
associated with very high energy. We believe that the un- 
smoothness is connected with the rather extended range 
of the (generalized) LMTO basis functions in the present 
implementation. While their range is finite, the LM- 
TOs are nevertheless far more extended than, e.g. max- 
imally localized Wannicr functions or screened MTOs 
[sst . Because of their rather long range, these basis func- 
tions are rather strongly linearly dependent (e.g. the 
smallest eigenvalues of the overlap matrix can be of or- 
der 10"^''). This, when combined with small errors in 
ziT^S^(k), an unphysically sharp (unsmooth) dependence 
of the QPEs on k can result for points far from a mesh 
point. To pick up the normal part, it is necessary to use 
£'xccut2 ~ i?F + 2 Ry or so. (The largest i?xccut2 which 
still results in smooth behavior varied from case to case. 
Somewhat larger ii'xccut2 can be used for wide-band sys- 
tems such as diamond; systems with heavy elements such 
as Bi often require somewhat smaller i?xccut2)- 

A second reason for using the "diagonal" approxima- 
tion for high-energy states is that it can significantly re- 
duce the computational effort, with minimal loss in ac- 
curacy. 

In the time-consuming generation of AV^'^ (bold box 
in Fig. [5]) only a subblock of AV^^^{kn\csh) is calculated, 
for states with ekn,£km < ^'xccuti- This i^xccuti should 
be somehow larger than -E'xccut2 (£'xccut2+ 1 to 2 Ry), so 
as to obtain ziy^5j(k) with smooth k-dependence (this 
eliminates nonanalytic behavior from sudden band cut- 
offs, which prohibits smooth interpolation). 

a and h are determined by fitting them to Z\l^f(k) 
which are initially generated by a test calculation with 
somewhat larger -Exccut2- Typically the calculated diago- 
nal AV^^ilf.) is reasonably linear in energy in the window 
2 — 4 Ry above Ep^ and the approximation is a reasonable 
one. In any case the contribution from these high-lying 
states affect QPEs in the range of interest {E-p ± 1 Ry) 
very slightly; they depend minimally on what diagonal 
matrix is taken, or what ii^xccut2 is used. Even neglecting 
the diagonal part altogether (a = and 5 = 0) only mod- 
estly affects results (<0.1 eV change in QPEs). We finally 
confirm the stability of the final results by monitoring 
how QPEs depend on £'xccut2 • Generally the dependence 
of QPEs on £'xccut2 (when i?xccut2 >2 Ry) is very weak. 
Typically wc take i?xccut2 ^ E-p + 2.b to iJp + 3.5 Ry 
(larger values for, e.g. diamond). Thus we can say the 
effect of the LDA used as a reference here is essentially 
negligible. 

ijo is then given as i??^(k) = ei^^Sf,^ + AV^U^) in 
the LDA basis. 

We use LDA as a reference not only for the above in- 
terpolation scheme, but also to generate MTOs. The 
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shape of the augmented functions {^prii 'fRh fm) B^nd 
also CORE states arc generated by the LDA potential. 
However, we expect this little affects the results; sp par- 
tial waves are but little changed, and we use local orbitals 
to ensure the basis is complete. CORE are sometimes 
tested as "core". Thus we believe that out implemen- 
tation gives results which are largely independent of the 
LDA. 



III. NUMERICAL RESULTS 

Here, we first show some convergence tests, and then 
show results for some kinds of materials to explain how 
QSGVK works. 

A. Convergence Test for Mixed basis 



calculations. Comparison with the "Small" case shows 
that l^^-^ = 2 is poor; in our experience, l^^^ must be 
twice larger than the maximum I basis function that has 
significant electron occupancy. 

In addition we tested many possible kinds MTO ba- 
sis sets, similar to the tests presented for Si in Ref. [T]| . 
For GaAs, e.g., removing the 5g orbitals from Ga and 
As reduces the minimum band gap by ~ 0.05 eV; 
adding 4/xl 4- 5^x1 for floating orbitals increases it by 
~ 0.03 eV. In general we have found that it is difficult 
to attain satisfactory numerical stability (convergence) 
to better than ^0.1 cV (this is a conservative estimate; 
we probably reach ~0.05 cV or less for a simple semicon- 
ductor such as GaAs). This means that the numerical 
accuracy is mainly controlled by the quality of the eigen- 
functions input to the GW calculation, not by the cutoff 
parameters as tested here. 



Table |T] shows convergence checks of the mixed basis 
(product basis and IPWs) expansion of W, and the con- 
vergence in IPWs for eigenfunctions. We take GaAs, 
where we use 92 MTO basis set for valence: 



B. Test of k point convergence 



Ga: 4sx2-|-4px2 + 3dx2 + 4/ + 5g-|-4(i(local); 39 func- 
tions 



As: 4sx2 + 4px2 
tions 



3(ix2 + 4/-F5g + 5s(local); 35 func- 



Fl: Is + 2p + 3d: 9x2 functions. 

"Fl" denotes floating orbitals (MTOs with zero augmen- 
tation radius [ll[) centered at the two interstitial sites. 

was made self-consistent for a reference case; 
then one-shot calculations were performed systematically 
varying parameters in the mixed basis. The result is 
shown in Table U where computational cases are speci- 
fied by the two numbers (or labels "Small" and "Big" ) in 
the left-end column. These are IPW cutoffs |q+ Gj^^^j, 
and Iq+GjJJg^ introduced in Sec. Ill CI The reference 
case is denoted by 3.5,2.6 meaning |q -I- Gj^^^^ = 3.6 and 

|q+G|?Iax = 2.6. 

In the test (i) in Table U we show the convergence 
with respect to Iq+GlJ^j^^- Iq-t-Gl^j^^ = 2.6 a.u. is 
sufficient for <0.01cV numerical convergence. Test (ii) is 
for |q4-G|^g^^; this also shows that 3.6 a.u. (even 3.0 
a.u.) is sufficient <0.01eV convergence. 

We can characterize the PB by the number of radial 
functions in each / channel. In the reference case this 
consists of (5,5,6,4,3) and (6,6,6,4.2) functions for Ip = 
0,..,4 on the Ga and As sites, respectively {l^^-^ — 4 
in this case). The total number of PB is then 5x1 + 
5x3 + 6x5 + 4x7 + 3x9 = 100 on Ga and on As. The 
"Big" PB of Table U used l^^^ = 5 with (8,8,8,6,3,2) 
functions for /p=0,..,5 on Ga and As (326 PB total), while 
the "Small" PB had l^^^ = 2 with (5,5,6) and (6,6,6) 
functions for = 0, 1, 2 on Ga and As. Test (iii) shows 
that the reference PB is satisfactory; it is a typical one for 
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FIG. 3: (Color online) Convergence in the r25'i,— Xic (circles) 
and Eo = T251V — Fisc (squares) gaps in Si with the density 
of k points in the Brillouin zone, l/n|, calculated within the 
QSGW approximation. Points correspond to Uk = 4, 5, 6, 7, 8. 
Dot-dashed line is shown as a guide to the eye. 



k-point convergence in QSGiycalculations is somewhat 
harder to attain than in LDA calculation. Fig. [3] shows 
the dependence of the direct and r25t, Xic gaps on 
Uk, calculated by QSGFF( self-consistent results). Data 
are plotted as 1/7t.|, where n| = N1N2N3 is the number 
of mesh points in the BZ see Eq. ([47|) . Data is essent ially 
linear in l/n|, as it is in one-shot calculations prl. |56|. 
The figure enables us to estimate the error owing to in- 
complete k convergence by extrapolation to l/n^ — > 0. 
Fig. [3] indicates that our implementation is stable enough 
to attain the self-consistency. 



17 



GaAs Tie Fisc Xsi, Xic X3C L3„ Lie Lsc 

(i) varying Iq + GlJ^ax 

3.5,1.8 1.932 4.761 -2.973 2.026 2.452 -1.259 2.087 5.588 

3.5.2.5 2.041 4.799 -2.935 2.168 2.566 -1.245 2.189 5.652 

3.5.2.6 2.046 4.803 -2.933 2.175 2.571 -1.244 2.194 5.657 
3.5,3.0 2.053 4.809 -2.930 2.186 2.580 -1.243 2.201 5.664 
3.5,3.5 2.053 4.810 -2.929 2.188 2.581 -1.243 2.202 5^66 

(ii) varying |q + Gl^i^^ 

2.6,2.6 2.038 4.795 -2.939 2.158 2.558 -1.247 2.183 5.647 

3.0,2.6 2.045 4.801 -2.935 2.172 2.569 -1.245 2.192 5.654 

3.5,2.6 2.046 4.803 -2.933 2.175 2.571 -1.244 2.194 5^57 

(iii) varying the product basis for W 

Small 1.249 4.611 -3.202 1.702 2.133 -1.364 1.623 5.452 

3.5,2.6 2.046 4.803 -2.933 2.175 2.571 -1.244 2.194 5.657 

Big 2.057 4.811 -2.928 2.193 2.588 -1.243 2.207 5.669 



TABLE I: Three tests for GaAs, showing convergence with respect to the number of the interstitial plane waves (IPW), 
and product basis(PB). Calculations are performed for a 4x4x4 k-mesh, Eq. (|47|l (Results are not fc-converged, but the 
differences between rows will not change significantly.) QPEs are in eV, relative to the top of valence. For tests (i) and (ii), 
numbers in the first column are cutoffs in the G- vectors for wave functions (|q -I- Gl^ax) ^-^d Coulomb interaction ([q -I- Gl^ax)- 
|q-|- G| = (3.5,3.0,2.6,2.5, 1.8) a.u. correspond to (229,137,89,65,27) IPWs at q = 0. The starting H° was generated by the 
3.5,2.6 case and was not updated to perform the other tests. Test (iii) used the reference IPW cutoffs and varied the product 
basis, as described in the text. 



C. C, Si, SiC, GaAs, ZnS and ZnSe 



Tabic HnHVl show QPE within various kinds of GWA, 
for C and Si, and for SiC, GaAs, ZnS and ZnSe in 
the zincblende structure. 'Ishot' denotes the standard 
Ishot-GPF from the LDA including the Z factor, Eq. ([7]); 
we also show 'IshotNZ' starting from the LDA, but us- 
ing Eg. pi) with Z=l. As noted by ourselves and oth- 
ers IE B EO, S^l , standard 'Ishot' generally underesti- 
mates band gaps relative to experiment; for example, the 
difference between 'Ishot, 8x8x8' and LDA is 0.97-0.46 
eV for bandgap of Si. This difference is only ~60% of the 
correction 1.24 — 0.46 eV needed for the calculation to be 
exact. The systematic tendency to underestimate gaps in 
'Ishot' is also confirmed by a recent FP-LAPW GW cal- 
culation [4^ with huge basis sets. 'IshotNZ' shows better 
agreement with experiment; a justification for using Z~l 
was presented in Ref. pT| . However, this good agreement 
is somewhat fortuitous because of cancellation between 
two contributions to W: there is one contribution from 
the LDA bandgap underestimate, which overestimates 
n and underestimates W, and another from omission 
of excitonic effects which overestimates VF[ii|. In other 
words, 'Ishot' agreement with experiment would worsen 
if W were to include excitonic effects: the screening 
would be enhanced and result in smaller band gaps, 'e- 
only' denotes self-consistency in eigenvalues, while retain- 
ing LDA eigenfunctions (and charge density), 'mode- A' 
is QSGW^with Eq. ([10]), 'mode-B' with Eq. HI]), 'mode- 
A,8x8x8' is fc-converged to ~0.01 eV; see Fig. [31 

Tables [UlVI shows that 'e-only' results are closer to 
'IshotNZ' than 'Ishot'. We present results for a few 
materials here, but find that it appears to be true for 
a rather wide range of materials. We have already ar- 



gued this point in Ref. [1^1 ■ Mahan compares the two 
in the context of the Frolich Hamiltonian [S^. The 
Rayleigh-Schrodinger perturbation theory corresponds to 
'IshotNZ', which he argues is preferable to the Brillouin- 
Wigner perturbation which corresponds to 'Ishot'. 

'e-only' and 'mode-A' do not differ greatly in these 
semiconductors, because the LDA eigenfunctions are sim- 
ilar to the QSGFF eigenfunctions (this is evidenced by 
minimal differences in the LDA and QSGFF charge densi- 
ties). 'mode-A' band gaps are systematically larger than 
experiments, which we attribute to the omission of exci- 
tonic contributions to the dielectric function. 'mode-A' 
and 'modc-B' show only slight differences, which is indi- 
cator of the ambiguity in the QSGFF scheme due to the 
different possible choices for the off-diagonal contribu- 
tions to V^'^. This shows the ambiguity is not so serious 
a problem in practice. 

Fundamental gaps are precisely known; other high- 
lying indirect gaps are well known only in a few materials. 
Much better studied are direct transitions from measure- 
ments seen as critical points in the dielectric function or 
reflectivity measurements. We present some experimen- 
tal data for states at F, and also X and L. Agreement is 
generally very systematic: F-F and F-L transitions are 
usually overestimated by ~0.2 eV, and F-X transitions a 
by little less. We can expect valence band dispersions to 
be quite reliable in QSGM^; they are already reasonably 
good at the LDA level. Except for a handful of cases 
(GaAs, Si, Ge, and certain levels in other semiconduc- 
tors) there remains a significant uncertainty in the QP 
levels other than the minimum gap. In ZnSe, for exam- 
ple, the L point was inferred to_be 4.12 eV from a fast 
carrier dynamics measurement 
show ^0.2 eV variations in Ei 
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while optical data 
Lic-Ls'u; compare, for 
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TABLE II: QPE relative to top of valence for various kinds of GWA, in eV. Calculations use 6x6x6 k-points, except rows 
labeled '8x8x8'. k convergence can be estimated by comparing 'mode-A' and 'mode-A, 8x8x8' (see also Fig. 'Ishot' 
denotes standard Ishot-GW with H° = H^°^, Eq. We also show 'IshotNZ' for the one-shot case with Z=l. 'e-only' 
denotes eigenvalue-only self-consistency: eigenvalues are updated but LDA eigenfunctions are retained. 'mode-A' corresponds 
to QSGVF using Eq. (|10p . 'mode-B' to Eq. 'expt.-|-correction' adds to the experimental value contributions from spin- 



orbit and zero-point motion (referred to as 


'Adj' in Table III of Ref. [Til 


). Estimates for the latter are 


taken from Table III of 


Ref. o7i] . Experimental data taken from compilations 


in Refs. [58 


and 


m 


except where noted. 






C 


Lii, 


Lisc 


La'c 




X 


Ic 




Lie 


Lsc 


Eg 


LDA 


-21.32 


5.55 


13.55 


-6.29 


4.70 


-2.79 


9.00 


8.39 


4.09 


Ishot 


-22.24 


7.41 


14.89 


-6.69 


6.07 


-2.99 


10.37 


10.34 


5.51 


IshotNZ 


-22.60 


7.79 


15.26 


-6.80 


6.33 


-3.04 


10.67 


10.76 


5.77 


lshot,8x8x8 


-22.25 


7.38 


14.87 


-6.69 


6.04 


-2.99 


10.35 


10.32 


5.48 


lshotNZ,8x8x8 


-22.62 


7.76 


15.23 


-6.81 


6.30 


-3.05 


10.64 


10.73 


5.74 


e-only 


-23.03 


7.92 


15.72 


-6.92 


6.53 


-3.09 


10.96 


10.94 


5.94 


mode-A 


-23.05 


8.01 


15.73 


-6.89 


6.54 


-3.07 


10.96 


10.99 


5.97 


mode-A,8x8x8 


-23.06 


7.98 


15.70 


-6.90 


6.52 


-3.07 


10.93 


10.96 


5.94 


mode-B 


-23.05 


8.04 


15.79 


-6.89 


6.55 


-3.07 


10.98 


11.00 


5.97 


expt. 


-23.0'' 


7.14* 






6.08" 








5.5" 


expt. -I- correction 










6.45'' 








5.87'' 




Si 


Lii, 


Lisc 


La'c 




Xlc 




Lie 


Lac 


Es 


LDA 


-11.98 


2.52 


3.22 


-2.86 


0.59 


-1.20 


1.42 


3.29 


0.46 


Ishot 


-11.90 


3.14 


4.04 


-2.96 


1.12 


-1.25 


2.07 


3.91 


0.98 


IshotNZ 


-11.90 


3.34 


4.30 


-3.01 


1.27 


-1.28 


2.26 


4.11 


1.13 


lshot,8x8x8 


-11.89 


3.13 


4.02 


-2.96 


1.11 


-1.25 


2.05 


3.89 


0.97" 


lshotNZ,8x8x8 


-11.89 


3.32 


4.28 


-3.01 


1.25 


-1.27 


2.24 


4.09 


1.11 


e-only 


-12.17 


3.36 


4.30 


-3.05 


1.28 


-1.29 


2.26 


4.14 


1.14 


mode-A 


-12.20 


3.47 


4.40 


-3.05 


1.38 


-1.28 


2.36 


4.25 


1.25 


mode-A,8x8x8 


-12.19 


3.45 


4.38 


-3.05 


1.37 


-1.28 


2.35 


4.23 


1.23 


mode-B 


-12.21 


3.52 


4.49 


-3.04 


1.42 


-1.28 


2.42 


4.29 


1.28 


expt. 


-12.50 


3.35^ 


4.18 


-2.9 


1.32 


-1.2 


2.18^ 


4.10^ 


1.17 


expt. -I- correction 




















1.24" 



'^5.50 eV for fundamental gap on A line l62l . adding (calculated) 
difference 0.58 eV from min— >X. 
"^EP renormalization 0.37 eV [13 

'^Tfiis number was 0.95 eV in Ref. [Till because of differences in 
computational conditions. 
^EUipsometry 63]. Lc data assumes L3/„=— 1.28 eV). 
fCorrection: 0.01 cV (SO) + 0.06 (EP renormalization) |57| 



example Refs. [23 and [71[. Most of the photoemission 
data (used for occupied states in Tables ITlllIVP is also 
only reliable to a precision of a few tenths of an eV. 



Finally, when comparing to experiment, we must add 
corrections for spin-orbit coupling and electron-phonon 
(EP) renormalization of the gap owing to zero-point mo- 
tion of the nuclei. Neither arc included in the QP cal- 
culations, and both would reduce the QSGFFgaps. EP 
renormalizations have been tabulated for a wide range of 
semiconductors in Ref. [l^l ■ In some cases there appears 
to be some experimental uncertainty in how large it is. 
In Si for example, the EP renormalization was measured 
to be ~0.2 eV in Ref. [t^, while the studies of Cardona 
and coworkers put it much smaller [57| . 



D. ZnO and CU2O 



Data for the direct gap at F for wurzitc ZnO, and for 
CU2O (cuprite structure) are given in Table W\ In these 
materials, 'IshotNZ' (and more so 'Ishot') results are 
rather poor, as were ZnS and ZnSe (Table [IV| . Corre- 
sponding energy bands are shown in Figs. [4| and [6] We 
used 144 and 64 k points in the 1st BZ to make for 
ZnO and CU2O, respectively. ' IshotNZ (off-d)' denotes a 
Ishot calculation, including the off-diagonal elements of 
^yxc computed in 'mode- A'; i.e. bands were generated 
by i/LDA _|_ ^yxc -^viti-^Qut self-consistency. They differ 
little from standard 'IshotNZ' results in semiconductors, 
as we showed for variety of materials in Ref. [llj. Cal- 
culated dielectric constants £00 are shown in Table I VII 
Note its systematic underestimation. Data "With LFC" 
is the better calculation, as it includes the so-called local 
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TABLE III: See caption for Table HIl 



SiC 


^ LV 




J- Ic 


i ibC 








Ls^j 




Li. 


L<. 


LDA 


-15.31 




6.25 


7.18 


-3.20 


1.31 




1.06 




5.38 


7.13 


Ishot 


-15.79 




7.18 


8.54 


-3.43 


2.16 




1.14 




6.51 


8.41 


IshotNZ 


-16.01 




7.43 


8.89 


-3.51 


2.36 




1.17 




6.79 


8.73 


lshot,8x8x8 


-15.79 




7.16 


8.53 


-3.43 


2.14 




1.14 




6.49 


8.39 


lshotNZ,8x8x8 


-16.01 




7.42 


8.87 


-3.51 


2.34 




1.17 




6.77 


8.71 


e-only 


-16.33 




7.71 


9.09 


-3.56 


2.51 




1.18 




6.99 


8.95 


mode-A 


-16.35 




7.70 


9.13 


-3.58 


2.53 




1.18 




7.01 


8.97 


mode-A,8x8x8 


-16.35 




7.69 


9.12 


-3.57 


2.52 




1.18 




7.00 


8.96 


mode-B 


-16.34 




7.78 


9.18 


-3.58 


2.58 




1.18 




7.07 


9.01 


expt." 












2.39 














GaAs 


Ga 3d at T 


Fic 


Fl5c 










Lsu 


Lie 


Lsc 


LDA 


-14.87, 


-14.79 


0.34 


3.67 


-2.72 


1.32 


1.54 




-1.16 


0.86 


4.58 


Ishot 


-16.75, 


-16.70 


1.44 


4.30 


-2.86 


1.76 


2.11 




-1.22 


1.66 


5.14 


IshotNZ 


-17.64, 


-17.60 


1.75 


4.49 


-2.91 


1.88 


2.26 




-1.24 


1.88 


5.32 


lshot,8x8x8 


-16.75 , 


-16.69 


1.41 


4.27 


-2.86 


1.74 


2.08 




-1.23 


1.64 


5.12 


lshotNZ,8x8x8 


-17.65 , 


-17.61 


1.70 


4.46 


-2.91 


1.85 


2.23 




-1.25 


1.85 


5.29 


e-only 


-17.99, 


-17.97 


1.69 


4.58 


-2.94 


1.97 


2.32 




-1.26 


1.89 


5.44 


mode-A 


-18.13, 


-18.07 


1.97 


4.77 


-2.93 


2.15 


2.54 




-1.24 


2.14 


5.63 


mode-A,8x8x8 


-18.13, 


-18.07 


1.93 


4.74 


-2.93 


2.12 


2.51 




-1.25 


2.11 


5.60 


mode-B 


-18.10, 


-18.04 


2.03 


4.80 


-2.94 


2.17 


2.56 




-1.25 


2.18 


5.65 


expt. 


-18 


8" 


1.52'= 


4.51' 


-2.80" 


2.11" 






-1.30" 


1.78" 




expt. -I- correction 






1.69^ 



















"Ref. [gl 

'^Photoemission data, R ef. l65l (Ref. [6^ for Ga 3d levels) 
"^Ellipsometry from Ref. |67| . Lc and Xc assume L4 51,=— 1.25 eV, 
X7„=-3.01 eV (QSGW^ results with SO) 
.''Corrections include 0.11 eV (SO) + 0.06 eV(EP) 




mated. The discrepancy is large enough that 'IshotNZ,' 
which approximately corresponds to 'e-only' but neglect- 
ing changes in W^^^, is no longer a reasonable approx- 
imation. On the other band, Table |V] shows that the 
'e-only' and 'mode-A' are not so different (3.64 eV com- 
pared to 3.87 eV). This difference measures the contri- 
bution of the off-diagonal part to band gap; it is similar 
to ZnS. This modest difference suggests that the LDA 
eigenfunctions are still reasonable. However, there still 
remains a difficulty in disentangling d valence bands from 
the others in the 'e-only' case: topological connections 
in band dispersions can not be changed from the LDA 
topology, as we discussed in Ref. [ll[. 



FIG. 4: (Color online) Energy bands in wurzite ZnO. 
Sohd(Black) in left panehQS G W('mode-A') ; Broken(Green) 
in left:lshotNZ(off-d) ; Solid(Blue) in right panel:'e-only' ; 
Dotted(Red) in right panehLDA. 



field correction (See e.g. [21j). 



ZnO 



In contrast to cases in Sec. lIII Cl 'e-onlv' and 'IshotNZ' 
now show sizable differences. This is because the LDA 
gap is much too small, and W is significantly ovcresti- 



The imaginary part of the dielectric function, 
lme(q=0,a;), is calculated from the 'mode-A' potential 
and compared to the experimental function in Fig. O 
There is some discrepancy with experiment. Arnaud 
and Alouani [2l| calculated the excitonic contribution 
to Ime(a;) with the Bethe-Salpeter equation for several 
semiconductors. Generally speaking, such a contribution 
can shift peaks in Im e(q=0, uj) to lower energy and cre- 
ate new peaks just around the band edge. It is in fact 
just what is needed to correct the discrepancy with ex- 
periments as seen in Fig. [S] 
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TABLE IV: See caption for Table HIl 



/3-ZnS 


r, 

A It; 


Zn 3^/ at r 


r, 


LDA 


-13.10 


-fi 44 -H 95 


1.86 


Ishot 


-12.92 


-7 29 -fi 93 


3.23 


IshotNZ 


-12.92 


-7.74, -7.42 


3.59 


lshot,8x8x8 


-12.92 


-7.29 , -6.94 


3.21 


lshotNZ,8x8x8 


-12.93 


-7.75 , -7.42 


3.57 


e-only 


-13.52 


-8.14, -7.82 


3.83 


mode-A 


-13.50 


-8.33, -7.97 


4.06 


mode-A,8x8x8 


-13.50 


-8.33, -7.97 


4.04 


mode-B 


-13.53 


-8.31, -7.94 


4.13 


expt. 




-8.7" 


3.83* 


expt.+correction 






3.94" 



J- iDC 


Xr 


Xi 


Xd 




-L'lC 




6.22 


-2.23 


3.20 


3.89 


-0.87 


3.09 


6.76 


7 7*^ 
i . i o 






O.oD 


n Q9 
-u.yz 




o.zu 


8.15 


-2.39 


4.62 


5.76 


-0.94 


4.97 


8.61 


7.72 


-2.35 


4.32 


5.35 


-0.92 


4.57 


8.19 


8.14 


-2.39 


4.61 


5.74 


-0.94 


4.96 


8.59 


8.47 


-2.44 


4.91 


6.01 


-0.96 


5.22 


8.95 


8.69 


-2.43 


5.06 


6.24 


-0.95 


5.47 


9.14 


8.68 


-2.43 


5.05 


6.23 


-0.95 


5.45 


9.13 


8.75 


-2.44 


5.12 


6.30 


-0.95 


5.53 


9.20 



ZnSe 


Lii, 


Zn3d at F 


Fic 


LDA 


-13.32 


-6.64, -6.27 


1.05 


Ishot 


-13.15 


-7.66, -7.39 


2.31 


IshotNZ 


-13.09 


-8.17, -7.94 


2.63 


Ishot, 8x8x8 


-13.16 


-7.66, -7.39 


2.28 


lshotNZ,8x8x8 


-13.10 


-8.18, -7.94 


2.59 


e-only 


-13.61 


-8.52, -8.30 


2.79 


mode-A 


-13.64 


-8.63, -8.36 


3.11 


mode-A,8x8x8 


-13.65 


-8.63, -8.36 


3.08 


mode-B 


-13.65 


-8.65, -8.38 


3.16 


expt. 




-O.O'' 


2.82" 


expt.-|-correction 






3.00" 



Fisc 


Xg^ 


Xlc 


Xsc 


Lg^ 


Lie 


Lac 


5.71 


-2.21 


2.82 


3.32 


-0.88 


2.36 


6.30 


6.82 


-2.35 


3.62 


4.41 


-0.94 


3.57 


7.32 


7.14 


-2.40 


3.82 


4.70 


-0.96 


3.89 


7.62 


6.80 


-2.35 


3.60 


4.40 


-0.94 


3.56 


7.31 


7.12 


-2.40 


3.80 


4.68 


-0.96 


3.86 


7.60 


7.41 


-2.43 


4.08 


4.92 


-0.97 


4.08 


7.93 


7.69 


-2.42 


4.32 


5.20 


-0.96 


4.39 


8.19 


7.68 


-2.42 


4.30 


5.19 


-0.96 


4.38 


8.17 


7.72 


-2.42 


4.34 


5.23 


-0.96 


4.43 


8.21 




-2.1'* 


4.06^ 




-1.2'* 


3.96^ 





"Ref. 
''Ref. ii 



'^Corrections include 0.02 eV (SO) 



0.09 eV(EP) 

'^Ref. 
"Ref. [gi 

^Reflectivity from Ref. [tJ. Lc and Xc assume Yj^^^v- 
X7„=-2.47 eV (QSGVK results with SO) 
sCorrcctions include 0.13 eV (SO) + 0.05 eV(EP) [H 



-0.95 eV, 



TABLE V; Direct gap (eV) at F for ZnO and CU2O for 
kinds of GWA. For CU2O, Eg and Eo are the first and second 
minimum gap at F. 



Material 


Expt. 
+corr 


LDA 


Ishot 


IshotNZ 


IshotNZ 
(off-d) 


e-only 


mode-A 


ZnO 


3.60' 


0.71 


2.46 


2.88 


3.00 


3.64 


3.87 


CU2O Eg 


2.20' 


0.53 


1.51 


1.99 


1.95 


1.98 


2.36 


Eo 


2.58' 


1.29 


1.88 


1.97 


1.93 


2.32 


2.81 



"Values computed from 3.44-1-0.164 eV, 2.17-1-0.033 eV, and 
2.55-1-0.033 eV, where 0.164eV and 0.033eV are zero-point contri- 
butions; See Table III in Ref. ^ . The Ishot value 2.46 eV for ZnO 
is slightly different from a prior calculation (2.44 eV in Ref. [43 ) 
because of more precise computational conditions. TshotNZ(off-d)' 
denotes a Ishot calculation including the off-diagonal elements. 



TABLE VI: Optical dielectric constant e^D calculated in the 
RPA ^ from 'mode-A' Calculations were checked for 
fc-convergence; data shown used 3888 fc-points for ZnO and 
1444 points for the others. "With LFC" means including the 
local field correction (See e.g. [2H). 





no 


LFC with LFC Expt. 


ZnO(k//C-axis) 


3.2 


3.0 


3.75-4.0" 


CU2O 


5.9 


5.5 


6.46'' 


MnO 


3.9 


3.8 


4.95" 


NiO 


4.4 


4.3 


5.43;fe.7" 



"Ref. 
''Ref. Zi 
"Ref. 73, [zi 
''Ref. 



2. CU2O 

We have to distinguish between the QP density of 
states (QP-DOS), which is calculated from H'^ (the poles 
of G^), and the "spectrum" density-of-states, which is 
calculated from the poles of G. The QP-DOS is the im- 
portant quantity needed to describe the fundamental ex- 
citations in materials. QP-DOS in 'mode-A' are shown 



in Fig. [7| The absorption coefficient in RPA from is 
shown in Fig. [51 as well as the energy bands in Fig. [Sj 
This calculation includes Cu3p and Cu4d as VAL states 
using local orbitals. 

As shown in Table [Vl the discrepancy between "e-only" 
and "mode-A" fundamental gap is ~0.4 eV, rather signif- 
icant and larger than cases considered previously. This 
reflects an increased discrepancy between the LDA and 
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-with LFC 
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FIG. 5: (Color online) Imaginary part of dielectric function 
for ZnO. Local field corrections (LFC) only slightly affect the 
result. 
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FIG. 6: (Color online) Energy bands of CU2O. Dot- 
ted(Red):LDA ; Solid(Black):QSGiy('mode-A'). 
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FIG. 8: (Color online) Absorption coefficient for CU2O. Ex- 
periment is taken from Ref. (80]. 



QSG W eigenfunctions. The 'mode-A' energy bands can 
be compared with results by Bruneval et al. [8l|, who 
also performed QS G W calculations within a pseudopo- 
tcntial framework. The lowest and second gaps we ob- 
tain, £^3=2. 36 eV and i?o=2.81 eV, are somewhat larger 
than their values (1.97 eV and 2.27 eV). Further, the dif- 
ference between the peaks just below Ep and the main 
3d peak is 1.90 eV, which is slightly larger than what 
Bruneval et al. obtain (1.64 eV). This is the Dl-Fl dif- 
ference in Ref. [s^l, measured to be 1.94 eV. The ab- 
sorption coefficient shows essentially the same kinds of 
discrepancy with experiment as we saw for Ime(a;) in 
ZnO. Bruneval et al. calculated the excitonic contribu- 



tions for CU2O in a Bethe-Salpeter framework [8IJ, and 
showed that they account for most of the error in the 
RPA dielectric ftmction as computed by QSGW. 



E. NiO and MnO 
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0.2 

0.4 
0.2 




Energy(eV) ' 

FIG. 7: Partial DOS (QP-DOS) computed from the QSGW' 
H" for CU2O. Valence band maximum at zero. Right-hand 
scale is for unoccupied states and is 10 x the left-hand scale, 
which applies to occupied states. 



We described these compounds already in Ref. [l5|; 
here we present some additional analysis. We assume 
antiferro-II ordering [s^ and time-reversal symmetry 
(thus no orbital moments), with 64 k-points in the 1st 
BZ. In Fig. [HI we show QSGW 'mode-A' energy bands, 
comparing them to 'e-only' and LDA in the right panel. 
The problem with 'c-only' bands is now very apparent: 
the bandgap is much too small and the conduction band 
dispersions arc qualitatively wrong (minimum gap falls 
at the wrong point for NiO). Further, the valence bands 
(especially, the relative position of O 2p and TM 3d 
bands) changes little relative to the LDA. 'e-only' can not 
change the LDA's spin moment since eigenfunctions do 
not change; they are significantly underestimated. The 
QP-DOS are shown in Fig. (TUl As we detail below, the 
self-consistency is essential for these materials as was 
found by Aryasetiawan and Gunnarsson (l^ . 
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1. MnO 

The Mn Cg and O 2p QP-DOS show common peaks 
below E-p. This is because of the strong dda coupUn g be - 
tween Cg, mediated through the O 2p valence bands [83l |. 
The Cg components are separated mainly into two peaks 
which have comparable weight, in contradiction to the 
LDA case where the deeper peak has a very small weight. 
This is because Cg levels are pushed down relative to 
the LDA. The bottom panel compares XPS (occupied 
states) and BIS (unoccupied states) experiments with 
the total QP-DOS, broadened with 0.6 eV Gaussian. 
There is good agreement with the XPS part for the eg 
peak just below E-p^ for the t2g peak, and for the va- 
lence band width. (Based on a model analysis by Taka- 
hashi and Igarashi [sl], we expect that many-body ef- 
fects do not strongly perturb the QP-DOS.) However, 
there is discrepancy in the BIS part. A possible assign- 
ment is to identify a shoulder of total QP-DOS seen at ~ 
i^F+TeV as the peak of the BIS at ~ 6.8 eV (as claimed 
in Ref. [IBl)- Alternatively it is possible that the QP- 
DOS predicts conduction bands ^ 1.5eV higher than the 
BIS data. 

Fig. [TT] compares lme{uj) generated by QSGFF'mode- 
A' with experiment. The discrepancy looks too large to 
explain the difference between the 'mode-A' calculation 



FIG. 10: (Color online) QP-DOS('mode-A') for NiO and 
MnO for each atomic site. Contributions are decomposed 
majority and minority in Mn(Ni) site in top two panels. Fur- 
ther, it is decomposed into eg and t2g- Dotted(Red): eg\ 
Solid (Black) :t2c/. O (total) is the QP-DOS project onto an O 
site (sum of majority and minority). Dots(Red) in the bottom 
panels are taken from XPS-I-BIS experiments [s^-lssj. and are 
compared with total QP-DOS broadened with a 0.6 eV Gaus- 
sian. 



and the experimental data. However, if we neglect the 
difference in absolute value, we can say that 'mode-A' 
predicts the peak Ime(cj) at an energy too high by ~2 eV. 
This view is consistent with the conjecture below for NiO. 



2. NiO 

The majority-spin QP-DOS of NiO is roughly similar 
to that of MnO. However, the Cg state just below Ey is 
broadened and carries less weight (Fig.[TO|l. That peak is 
seen in MnO but is lost in this case, though the deeper Cg 
peak remains. The t2g DOS is widened relative to LDA. 
These features are also observed in other beyond-LDA 
calculations [s^, [Q^I ■ The top of valence consists of O 2p 
states, which hybridize with majority eg states, and mi- 
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FIG. 11: (Color online) Imaginary part of the dielectric func- 
tion in MnO and NiO. Effects of local fields (LFs) are signif- 
icant only in the deep energy region related to the 3p core 
(inset). LP effects contribute mainly to such local atomic-like 
excitations; they should be interpreted in terms of the Lorenz 
field in the Clausius-Mossotti relation. Local fields reduce the 
peaks of localized excitations, and moves them to higher en- 
ergy (this is already seen in rutile Ti02 [l^). Experimental 
data is taken from Ref. [§3] for MnO, and Ref. [zl] for NiO. 



nority <2g which weakly hybridize. In the bottom panel, 
the d DOS between Ep and E'p— 4eV falls in good agree- 
ment with XPS; but the valence DOS width differs from 
XPS, in contradistinction to MnO. The reason can be at- 
tributed to the satellite structure contained in the XPS 
data: e.g. Takahashi et al. predict that a satellite 
should appear around Ep—deV. 

Turning to the unoccupied states, we can see 'mode- 
A' puts a peak ~1.3 eV too high compared BIS peak at 
~i?F-|-5eV. On the other hand, Ime(cL') is in rather rea- 
sonable agreement with experiment except for a shift in 
the first peak by ^2.0 eV ('mode-A'). Thus we can dis- 
tinguish two kinds errors in bandgap: (1.3 eV in BIS, 
and 2.0 eV in Ime). We think both kinds of errors can be 
explained by the excitonic effects for W missing in our 
QSGiy calculation. This is consistent with the QSGW 
too being underestimated. Thus our conjecture is: if 
we properly include the excitonic contributions to W ^ 



weights in Ime will shift to lower energy, increasing erx,- 
Self-consistency with such W should reduce the band 
gap, simultaneously improving agreement with BIS and 
the dielectric function. 



F. Fe and Ni 

Fig. [T^ shows energy bands for Fe and Ni calculated 
by LDA, QSGW^ 'mode-A' and 'e-only'. The two qSGW 
calculations show similar d band shapes: their widths 
narrow relative to LDA, moving into closer agreement 
with experiment. On the other hand, the 'e-only' cal- 
culation significantly shifts the relative positions of the 
s and d levels, depressing the bottom of s band ~1 eV 
in contradiction to experiment. 'IshotNZ' results (not 
shown) are very similar to the 'e-only' case. This in- 
dicates the importance of the charge redistribution due 
to the off-diagonal part of Eq. (fTOl) in the 3d transition 
metals. Yamasaki and Fujiwara [4^ presented the 'Ishot' 
GWA results for Fe, Ni, and Cu, Aryasetiawan [9l| for 
Ni. Both calculations included the Z factor [Z k, 0.8 
for s band, Z w 0.6 for d band) thus the changes they 
found relative to LDA are not so large. Including the 
Z factor mostly eliminates the unwonted s band shift. 
However it does so apparently fortuitously. As 'Ishot' 
(and 'IshotNZ') should be taken as an approximation of 
'e-only', it is wrong to take 'Ishot' as a better theoretical 
prediction than 'e-only'. The calculated spin magnetic 
moments are listed in Ref. [l^ . Little difference with ex- 
periment is found for Fe (2.2 ^b)- For Ni, QSGW gives 
0.7/iB- a- little larger than the experimental value 0.6 /ie. 
This is reasonable because QSG Undoes not include the 
effect of spin fluctuations. 



IV. SUMMARY 

We showed that the QSGW^ formalism can be derived 
from a standpoint of self-consistent perturbation theory, 
where the unperturbed Hamiltonian is an optimized non- 
interacting one. Then we presented a means to calculate 
the total energy based on adiabatic connection, and how 
it relates to the density functional framework. 

We then showed a number of key points in the im- 
plementation of our all-electron GWA and QSGW. The 
mixed basis for W and the offset-F method are especially 
important technical points. We presented some conver- 
gence checks for the mixed basis, and some results in var- 
ious kinds of systems to demonstrate how well QSGM^ 
works. For insulators, QSGFF provides rather satisfac- 
tory description of valence bands; conduction bands are 
also well described though bandgaps are systematically 
overestimated slightly. We compared the difference of 
the QSGW results to eigenvalue-only self-consistent re- 
sults and to Ishot-GVF results in various cases. 

How well eigenvalue-only self-consistency works de- 
pends on how correlated the system is. In covalent 
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best of our knowledge, it is not clearly discussed in spite 
of its importance. Hereafter, W and 11 denote the dy- 
namical screened Coulomb interaction, and the proper 
polarization function without approximation; and H*' 
denote the same in RPA. It gets clearer that the well- 
balanced treatment between the vertex function F and G 
to respect "Z factor cancellation" is important; thus the 
so-called fuU self-consistent GW [2i,[33, 92, 93, 94] must 
be a problematic approximation. We will use symbolical 
notations hereafter for simplicity. 

As is well known, the exact self energy S is calculated 
from G as E = GWT G can be written as 



G = ZG" + G, 



(Al) 



L r xwrL r xwr 



where is the QP part of the Green's function, Z is 
the renomalization factor, and G is the incoherent part. 
The incoherent part contains physically unclear kinds of 
intermediate states, which are not always characterizable 
by a single-particle propagator. 

In (a) below, we consider the Z-factor cancellation in 
the calculation of E = GWT for given G, W, and F and 
in (b), the Z factor cancellation in 11. 



FIG. 12: (Color online) Energy bands for Fe and Ni. 
Solid(Black): QSGW'Cmode-A') ; Broken(Blue): 'e-only' ; 
Dotted(red): LDA. Calculations used 12x12x12 = 1728 k- 
points in the 1st BZ. Comparing to calculations at other k 
meshes 8x8x8 . . . 14x14x14, we estimate the numerical con- 
vergence is a little better than 0.1 eV. In this case con verg ence 
is limited by uncertainties in the determination Ef [5J]. In- 
spection of the bands at fine resolution show slight discon- 
tinuities at certain k points. These occur at times because 
of difficulties in the E interpolation; see Sec. Ill Gl and the 
smearing procedure in Sec. Ill El 



(a) In the integration of GWT, the most dominant part 
is related to the long range static part of W, the q ^ 
and Lu ^ limit of W{q,Lu). In this limit, the vertex 
function becomes 



(A2) 



semiconductors LDA eigenfunctions are rather good, and 
there is little difference. The error is no longer small in 
CU2O, Fe, and Ni; finally eigenvalue-only self-consistency 
fails qualitatively in NiO. 

We would like to thank F. Aryasetiawan for giving us 
his LMTO-ASA GW code, which formed the basis for de- 
velopment of the present one. This work was supported 
by the DARPA SPINS project, and by ONR contract 
N00014-02-1-1025. We thank the Fulton HPC for com- 
putational resources used in this project. 

APPENDIX A: HOW TO JUSTIFY G°W° 
APPROXIMATION FROM S[G]? — Z FACTOR 
CANCELLATION 

We explain the "Z factor cancellation" , which is one 
justification for so-called G^W° approximation. To the 



This is a Ward identity. Here we need to assume the 
insulator case. Then there is a cancellation between Z 
from ZG'^ and 1/Z from F. Under the assumption that 
G is rather structureless, GW may give almost state- 
independent contributions (may results in a little changes 
of chemical potential), thus GWT « G°VF is essentially 
satisfied. In the case of metal, there is an additional term 
in right-hand side of Eq. (|A2|) due to the exsitence of the 
Fermi surface; then we expect F > 1/Z, see e.g. [95l |: 
the point (b) below should be interpreted in the same 
manner. In any cases, we can claim the poorness of fully 
self-consistent GW which neglect F, as discussed in the 
foUwing. 



(b) W is given in terms of the proper polarization func- 
tion H as VF = u(l-uH)-i. H(l,4) = n(l,l;4,4), which 
is written as 
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n(l, 1'; 4, 4') = G2(l, 1'; 4, 4') + 62(1, 1'; 2, 2')/(2, 2'; 3, 3')n(3, 3'; 4, 4') 

= G2 + G2IG2 + G2IG2IG2 + . . . . 



(A3) 



r 



Here G2(l,l';2,2') = -iG(l, 2)G(2', 1'), and the dupli- 
cated indices are integrated (the Einstein sum rule). This 
looks like an electron-hole ladder diagram, but with the 
two-particle irreducible kernel J(l,l';2,2') playing the 
role of the steps of the ladder. G2 contains an electron- 
hole pair excitation \E'i(ri)5',*(r'j)5'*(r2)5'j(r2) x {nj — 
ni)S{uj — {si — Ej)) multipled by in its intermediate 
state (imaginary part), because G2 contains ZG° x ZG". 
Here Ei and '^i denote a QPE and QP cigcnfunction in- 
cluded in G", and Uj — rii is the occupation number dif- 
ference. 

Let us consider how much the pair excitation is included 
in n (i.e. how 11(1, 2; 3, 4) changes when a pair excita- 
tion is added or removed) in its intermediate states. This 
means take the derivative of n through G2 with respect 
to Uj — rii (derivative is not through /. Such contribution 
is not two-particle reducible, that is, not for the interme- 
diate states for H). 

We can show that 

Zin(l, 4) = r(l, 2, 2')Z\G2(2, 2'; 3, 3')r(4, 3', 3) 



;this is derived from Eq. (jA3P with paying attention to 
the matrix notation; r(l, 2, 2') = and r(4, 3', 3) = 

i_}iQ^ symbolically. Thus we see that the additional pair 
excitation (intermediate state) is included in 11 with the 
weight y X Z"^ x ^ — 1 for q "> 0,ti; — > 0, because 
of Eq. dSH) (factors l/Z come from T). This is the Z 
factor cancellation mechanism for 11. Bechstedt et al. 



demonstrated this in practice [3lJ at the lowest level of 
approximation. 

This suggests that n(l,2) « nO(l,2) = -iG° x G^ 
is a reasonable approximation because the derivative of 
n''(l, 2) apparently does not include any Z factor — thus 
the Z factor cancellation is trivially satisfied. 

In the above discussion, we use the fact that GWA is 
dominated by the long range part of W] we may expect 
such Z cancellation somehow occurs even for short-range 
W; but it may be less meaningful. The above discussion 
shows why the fully self-consistent GW method is a poor 
approximation. Because the vertex function is omitted, 
Z in G = ZG° + G is not canceled. E = must 
be a rigorous formula; however, the series expansion in 
G should be very inefhcient — it contains rather large 
cancellations between terms in the series so as to cancel 
out the effect of Z as seen in (a) and (b). 

On the other hand, the G^W^ approximation looks 
reasonable from the viewpoint of (a) and (b), because 
it includes contributions from QPs with correct weights. 



From the beginning, this is what we expect from the 
Landau-Silin QP picture. 

Z factor cancellation is generally important. For exam- 
ple, the Bethe-Salpeter equation (BSE) can be described 
as the sum of the ladder diagrams; if G is used instead 
of G^ in the sum, it should give a similarly poor result. 

To summarize, it looks more reasonable to calculate 
S[G] through the QP part G° contained in G. That is, 
G ^ G° E, where we can use G^W'^ approximation 
for G° ^ E. From this S, we can calculate a new G; this 
suggest the self-consistency cycle G ^ G° G ^ G'^ .... 
The problem is how to extract G° from G; the QSGH^ 
method gives a (nearly) optimal prescription. 

Mahan and Sernelius [3J| also emphasized the balanced 
treatment of the vertex function and W . Their work 
however, is not directly related to the discussion here. 
Their calculation is not based on GWT; instead they use 
G*^, and their vertex at q is not l/Z, but unity. Their 
formula is based on the derivative of the G°-based total 
energy with respect to the occupation number [96j . Their 
vertex function is identified as the correction to modify 
W into the effective interaction between a test charge 
and a QP. Their formula (or originally from Quinn et al. 
[9^) is related to the discussion in Appendix IbI 



APPENDIX B: CAN WE DETERMINE G BY 
TOTAL ENERGY MINIMIZATION? 



The RPA total energy Eq. ([22]) can be taken as a func- 
tional of V°^{v, r'): E^^^ depends on y^(r, r') through 
G°. Note that the HE part of the total energy docs not 
explicitly include QPEs (eigenvalues of H^), but E^'^'^^ 
does include them. 

In contradiction to the local potential case as, e.g. in 
the Kohn-Sham construction of DFT, it is meaningless to 
minimize E^^^ with respect to y^(r, r'). y*^(r, r') con- 
tains degree of freedom that can shift QPEs while keeping 
the eigenfunctions fixed (This is realized by adding a po- 
tential propotional to ilJi{T)ilj*{r')). Thus it is possible to 
change only i<;<=.RPA ]-,y varying V°^[r,r') in such a way 
that QPEs change but not eigenfunctions. This implies 
that E^'^^^ can be infinite (no lower bound) when all 
QPEs arc moved to the Fermi energy. 

On the other hand, it is possible to determine a QPE 
from the functional derivative of E^'^^ with respect to 
the occupancy of a state ^i. It gives the QPE as £i = 
'^g^ [96j . This is in agreement with QPE calculated by 
GW^A starting from . Thus, under the fixed QP eigen- 
fimctions, we can determine QPEs self-consistently; we 
use these in E^^^ and take its derivative with respect 
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to ^'i to determine the next Si — this is repeated until 
converged. This is nothing but the eigenvalue-only self- 
consistent scheme. QPE are not determined by the total 
energy minimization: that is QPEs are not variational 
parameters. Nevertheless it is a self-consistency condi- 
tion (consistency for the excitations around the ground 
state) and it is meaningful. 

Therefore, we can calculate E^^^ for a given complete 
set of QP eigenfunctions, where QPEs are made self- 
consistent in the manner above. It will be possible to 



minimize this E^^^ with respect to the set of QP eigen- 
functions. However, such a formalism looks too compli- 
cated. Further, only the occupied QP eigenfunctions are 
included in the Hartree-Fock part of E^^^; thus, conti- 
nuity (smoothness) from the occupied eigenfunctions to 
unoccupied eigenfunctions will be lost. Thus we think 
that it is better to choose another possibility, namely to 
determine not only QPEs but also the QP eigenfunctions 
in the self-consistency cycle, as we do in QS GW. 
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